Computer  Science  Department 


TECHNICAL  REPORT 


On  Shape  Optimizing  the  Ratio  of  the  First  Two 
Eigenvalues  of  the  Laplacian 


Jean-Pierre  A.  Haeberly 
Technical  Report  586 

October  1991 


NEW  YORK  UNIVERSITY 


r4 


<  ^ 


is 
00    <D 

in 
I 

K 
Eh 


N 


1  e 

C  -H 

M     0)  CU 

u  r:  o 
en 

o  x; 

X!  w 

>H    (TS  C 

ti;  c 


o 
o 


o 

-P    0) 
CO  jn 

»-l    4-1 
•H 

U-l  M-l 
O 

CD 

-P   o 

D 

in  I-) 

O    (C 

> 

o  c 

4->    tjl  I 
i~t    CD 


Department  of  Computer  Science 
Courant  Institute  of  Mathematical  Sciences 

251  MERCER  STREET,  NEW  YORK,  N.Y.  10012 


W   m   m 

m  :m  'm  4 


4i    m 


On  Shape  Optimizing  the  Ratio  of  the  First  Two 
Eigenvalues  of  the  Laplacian 


Jean-Pierre  A.  Haeberly 

Technical  Report  586 

October  1991 


On  Shape  Optimizing  the  Ratio  of  the  First 
Two  Eigenvalues  of  the  Laplacian 


Jean-Pierre  A.  Haeberly 


October  1991 


A  dissertation  in  the  Department  of  Computer  Science  submitted  to  the 
faculty  of  the  Graduate  School  of  Arts  and  Science  in  partial  fulfillment  of 
the  requirements  for  the  degree  of  Masters  of  Science  at  New  York  University. 


Approved 


(Signed) 


Michael  Overton 
Thesis  Advisor 


Abstract 

We  investigate  numerically  a  1956  conjecture  of  Payne,  Polya,  and  Wein- 
berger. The  conjecture  asserts  that  the  ratio  of  the  first  two  eigenvalues  of 
the  Laplacian  on  a  bounded  domain  il  of  the  plane  with  Dirichlet  boundary 
conditions  reaches  its  minimum  value  precisely  when  fi  is  a  disk.  A  crucial 
feature  of  this  problem  is  the  loss  of  smoothness  of  the  objective  function 
at  the  solution.  The  following  results  form  the  core  of  our  numerical  treat- 
ment. First,  we  construct  finite  dimensional  families  of  deformations  of  a 
disk  equipped  with  a  uniform  triangulation.  This  permits  the  formulation 
of  a  discrete  model  of  the  problem  via  finite  element  techniques.  Second,  we 
build  on  the  work  of  M.  Overton  to  derive  optimality  conditions  in  terms 
of  Clarke's  generalized  gradients  for  nonsmooth  functions.  These  ideas  are 
then  combined  into  an  algorithm  and  implemented  in  Fortran. 
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1      Introduction 

In  1956  a  conjecture  was  formulated  by  Payne,  Polya  and  Weinberger  con- 
cerning the  ratio  of  the  two  smallest  eigenvalues  of  the  Laplacian  on  bounded 
regions  Q.  of  the  plane  [18].  More  specifically,  if  Xi(tt)  <  A2(fi)  denote  these 
eigenvalues,  then  they  conjectured  that  the  ratio,  A2/A1,  attains  its  maxi- 
mum precisely  when  ft  is  a  disk.  Recently,  while  the  present  work  was  still 
in  progress,  Ashbaugh  and  Benguria  have  given  a  proof  that  the  conjecture 
is  indeed  true  [1,  2]. 

In  this  thesis  we  investigate  this  problem  from  a  different  perspective, 
namely  that  of  numerically  minimizing  this  ratio  for  an  appropriate  dis- 
cretized  model.  Optimization  problems  involving  eigenvalues  are  particu- 
Icirly  interesting  and  challenging  because  the  optimization  objective  often 
forces  some  eigenvalues  to  coalesce  at  the  solution  point  and  this  results  in 
a  loss  of  smoothness.  This  is  precisely  what  occurs  in  the  above  problem 
since  the  second  eigenvalue  of  the  Laplacian  on  a  disk  has  multiplicity  two. 
Methods  for  the  optimization  of  the  largest  eigenvalue  of  a  symmetric  matrix 
that  can  handle  the  lack  of  smoothness  at  the  solution  have  recently  been 
devised  by  Michael  Overton  [16,  15]  who  implemented  them  in  an  algorithm 
which  has  been  successfully  applied  to  an  extensive  collection  of  problems 
[16,  8].  We  extend  these  techniques  to  apply  in  our  context. 

The  main  steps  are  as  follows.  First  we  use  finite  elements  to  describe  a 
family  ft(i)  of  perturbations  of  a  disk  of  radius  R.  Here  x  lies  in  a  bounded 
open  neighborhood  U  of  (1,...,1)  in  S?"*,  and  Q(l,...,l)  is  our  approxi- 
mation of  the  disk.  Computing  the  eigenvalues  of  the  Laplaxuan  on  il{x)  is 
then  reduced  to  computing  the  eigenvalues  of  the  symmetric  definite  pen- 
cil {A{x),B{x)),  where  A{x)  is  the  stiffness  matrix  and  B{x)  is  the  mass 
matrix  corresponding  to  the  given  triangulation  of  ^(x)  [19].  Next  we  ex- 
tend the  work  of  Overton  and  Womersley  [17]  to  sums  ^f=i  A,  of  the  first 
k  eigenvalues  of  symmetric  pencils  {A,B).  More  precisely,  we  give  a  char- 
acterization of  the  generalized  gradient  of  ^i_i  A,  as  a  function  of  both  A 
and  B.  We  then  apply  this  result  to  obtain  a  description  of  the  general- 
ized gradient  J2'i=i  ^ii^)i  the  sum  of  the  first  k  eigenvalues  of  A{x),B{x). 
This,  combined  with  Clarke's  calculus  of  generalized  gradients  [7],  allows 
us  to  derive  a  characterization  of  the  generalized  gradient  of  the  function 
p{x)  =  — (Ai(x)  +  A2(i))/Ai(x).  This  is  the  function  that  we  wish  to  mini- 
mize. Indeed,  minimizing  p(x)  is  equivalent  to  maximizing  A2(x)/Ai(i).  We 
are  then  able  to  derive  optimality  conditions  for  a  minimizer  of  p{x)  and  to 
formulate  the  appropriate  variant  of  Overton's  algorithm. 


The  thesis  is  organized  as  follows.  We  be^n  with  a  few  preliminary 
remarks  about  eigenvalues  of  symmetric  pencils  in  section  2.  Then,  in  sec- 
tions 3  and  4,  we  recall  the  results  of  Overton  and  Womersley  [17]  men- 
tioned above.  In  section  5  we  show  how  this  leads  to  a  characterization  of 
the  generalized  gradient  of  the  sum,  ^JLj  A,,  of  the  first  k  eigenvalues  of  the 
symmetric  pencil  iA,B)  by  considering  the  symmetric  matrix  G~^  AG~^ , 
where  G  is  the  Choleski  factor  B.  Then,  composing  the  function  5Z:=i  \ 
with  the  smooth  function  associating  {A{x),B{x))  to  x  e  U ,  we  give  a  de- 
scritption  of  the  generalized  gradient  of  X2f=i  ^i^)  ^  ^  function  of  x  in 
section  6.  Next  we  compute  the  generalized  gradient  of  the  function  p{x) 
in  section  7,  while  in  section  8  we  digress  briefly  to  derive  the  generalized 
gradient  of  the  related  function  t]{x)  =  A2(i)  -  ^i(x),  measuring  the  gap 
between  A2  ajid  Ai .  Optimality  conditions  for  a  minimizer  of  p{x)  are  given 
in  section  9,  and  the  construction  of  our  family  of  perturbations  of  a  disk  is 
given  in  section  10.  Our  modified  version  of  Overton's  algorithm  is  described 
in  section  11.  Finally  a  discussion  of  the  numerical  data  generated  by  the 
algorithm  is  presented  in  section  12. 

We  wish  to  thank  our  advisor.  Prof.  M.  Overton,  for  introducing  us 
to  this  material  and  for  his  help  and  kindness  (and  patience!)  during  the 
preparation  of  this  thesis. 


Notations. 

Sff"*"  denotes  the  vector  space  of  n  x  n  real  matrices. 

«SR"  is  the  subspace  of  symmetric  nxn  matrices,  C"  the  group  of  orthogonal 

n  X  n  matrices  and  Q"  the  set  of  positive  definite  symmetric  matrices. 

53ff"  is  endowed  with  an  inner  product,  <  ,  >,  defined  by 


<M,N  >=^Tnijnij  =  TrM^N, 


»j 


for  M  =  {m,j},  N  =  {n.^}  in  5»". 

K  /li  is  an  n  X  r  matrix  and  A2  is  an  n  X  s  matrix,  A  =  [Ai  :  A^]  denotes 

the  nx  {r  -\-  s)  matrix  obtained  by  juxtaposing  the  columns  the  Ai  and  A2. 


2     Preliminaries 

We  collect  in  this  section  some  well  known  faicts  for  future  reference. 

First  observe  that  53?"  xQ"  is  a  convex  subset  of  5l?"x53(?".  Indeed  it 
is  obvious  that 

M,N  £  SlSr  =>  AM  +  (1  -  \)N  e  53fr  ,VA  €  » 

M,7V  e  Q"  =>  AM  +  (1  -  A)^  G  Q",VA  e  [0,1]. 

Lemma  2.1  A  symmetric  matrix  B  is  positive  definite  if  and  only  if  all 
its  eigenvalues  are  positive.  It  is  positive  semi-definite  if  and  only  if  all  its 
eigenvalues  are  nonnegative. 

Proof:  Let  Ai  >  Aj  >  . . .  >  A„  denote  the  eigenvalues  of  the  symmetric 
matrix  B.  There  exists  an  orthogonal  matrix  Q  such  that 

Q'^BQ  =  diag{Xi,..,,Xn}  =  A. 

The  Raleigh  characterization  of  A„  gives 

.    x'^Bx 
At,  =  min  — ~ — 

i#0     x^  X 

Then 

x'^Bx  >  0,  Vi  7^  0  «J=»  A„  >  0 

and 

x'^Bx  >  0,  Vx  <=>  A„  >  0.  D 

Lemma  2.2  Let  V  =  {u,j}  be  a  positive  semi-definite  symmetric  matrix. 
Then  we  have 

(i)  vu  >  0,  V: 

{ii)vfj  <  ViiVjj,  Vi,j. 

Proof:  (i)  This  is  trivial.  Take  x  to  be  the  vector  e,  =  (0, . . . ,  1, .  •  •  ,0) 
with  1  in  the  i"*  position.  Then 

Vii  =  x'^Vx  >  0. 


(ii)  ^ven  i  <  j,  consider  the  vector  i  with  Xi  =  A,  Xj  =  /x  and  Xfc  =  0 
for  aU  Jfc  7^  i.  Then 

x'^Vx     =     Yl  ^kXlVkl 
k,l 

=    x'iVii  +  x]vjj  +  2xiXjVij 
=    A^Uij  +  /i^Vjj  +  2XfiVij 

Hence  we  must  have 

X'^vii  +  n^vjj  +  2XfiVij  >  0,V  A,  M  (2.3) 

Viewing  (2.3)  as  a  polynomial  in  A  with  /i  fixed,  we  see  that  the  discriminant 
must  be  <  0  .  Thus 

4(1^ vfj  -  4ii^ViiVjj  <  0,V/i 

and  the  conclusion  follows.  D 

Given  a  symmetric  definite  pencil  {A,B),  the  spectrum  A{A,B)  may  be 
expressed  in  several  equivalent  ways.  Let  G  denote  the  Choleski  factor  of  B, 
that  is  C  is  a  positive  definite  lower  triangular  matrix  such  that  B  =  GCr^  . 

Lemma  2.4   The  following  sets  are  equal. 

1.  A{A,B) 

2.  A(B-M) 

3.  A(G-MG-^) 

Proof:  It  is  obvious  that  A(  A,  B)  =  A(5"^  A),  so  we  show  that  A(  A,  5)  = 
A{G~^AG~^).  Let  A  €  A{A,B)  and  let  x  be  an  eigenvector  for  A.  Write 
y  =  G^x.  Then 

Ax  =  XBx    ^^=>    Ax  =  XGG'^x 
^^    AG-^y  =  XGy 
^^     G-^AG-'^y  =  Xy 

Thus  we  have  proved  that  A  €  A{A,B)  with  eigenvector  x  if  and  only  if 
A  6  A{G~^AG~^)  with  eigenvector  y  =  G'^x.  □ 


Remark  2.5  The  advantage  of  viewing  A{A,B)  as  A{G~^AG~^)  rather 
than  as  A{B~^A)  is  that  G~^AG~^  is  still  a  symmetric  matrix.  On  the 
other  hand,  the  dependence  upon  the  data  (A,  B)  is  by  far  more  explicit  in 
the  matrix  B~^ A  than  in  G"MG"^. 

We  can  build  upon  the  proof  of  lemma  2.4  a  bit.  Suppose  B  £  Q"  with 
Choleski  factor  G.  Let  Ai  denote  the  set  of  orthonormaJ  bases  of  5f"  (i.e.  the 
set  of  orthonormal  matrices),  and  let  A2  denote  the  set  of  5-orthonormal 
bases  of  R",  i.e.  bases  {vi , . . . ,  i;„}  such  that 

vJBvj  =  6ij,  for  1  <  i,j  <  n. 

Lemma  2.6    The  matrix  G  induces  a  bijection  of  Ai  onto  A2  defined  as 

{tJi,...,t;„}  >-*  {G~'^vi,...,G~^Vn} 

Proof:  Given  {uj , . . . , r„}  ^  A\.  Then 

{G-'^  vif  B{G-'^  vj)    =    vf{G-^BG-'^)vj 

T 

=       Vi    Vj 

=     Sij.D 

Now  let  M  €  <S3ff".  We  recall  a  few  facts  about  the  eigenvalues  A,  of  M. 
Write  :  Ai  >  ...  >  A„,  and  let  Q  G  Q"  be  an  orthogonal  matrix  whose 
columns  gi, . . .  ,9n  form  an  orthonormaJ  basis  of  eigenvectors  of  M,  ordered 
so  that 

Q^MQ  =  A  =  diag{Xi,...,Xr,}. 

For  A;  >  0,  write 

nk  =  {ve^\  v^qi  =  0, 1  <  t  <  it} 

and  let  Hq  =  JJ".   Then  we  have  the  Raleigh  quotient  characterization  of 

Ait: 

iF  Mv 
Xk  =       max        — = — . 

v€//)k_i,t//0      v^  V 

Next  let  Ljn  denote  the  subspace  of  5f"  spanned  by  a  set  of  m  orthonormal 
vectors  {vi , . . . ,  r^}.  We  define  the  Raleigh  trace  of  M  on  Lm  by 

m 

Tr[Lm]  =  Y.vjMvi. 
1=1 


Observe  that  Tr[Lm]  is  independent  of  the  choice  of  an  orthonormal  basis 
of  the  subspace  Lm-  Indeed,  let  V  =  [vi  . .  .Vm]  so  that  V^V  =  /  and 

Tr[Lm]  =  Tr{V'^MV). 

Any  other  orthonormal  basis  {wi,...,Wjn}  of  Lm  with  W  =  [wi  ...Wm]  is 
such  that 

W  =  VA, 

for  some  orthonormal  m  x  m  matrix  A.  It  follows  that 

Tr{A^V'^MVA)  =  Tt{V'^  AV)  =  Tr[Lm]. 

Lemma  2.7   (fSj)  We  have  the  following  characterization  of  the  sum  of  the 
k  eigenvalues  Xm+i ,■■■, >^m+k  • 

m+k 

Yi    A.  =    m.^  Tr[Lk]. 

Proof:  Let  Lk  C  H^  be  fixed.  Then  for  any  u  €  Xjt,  we  have  v^qi  =  0, 
for  1  <  i  <  m.  Since  dim  Lk  =  k,  there  must  exist  a  Vk+m  €  Lk  such  that 

vJ+mQi  =  0,  for  I  <  i  <  k  +  m  -  1. 

Hence  Vk+m  €  Hk+m-\  ■,  aJid  it  follows  that 

Similarly,  we  construct  Um+j  €  ifc  n  Hjn+j-i  ioi  1  <  j  <  k  such  that 

^m+j^m+j  =  1 


and 

Thus  Am+j  >  v^^jMvm+j  and  it  follows  that 


Um+ji'/  =  0,  for  m  +  j  +  l<l<m  +  k. 


m+k 

Yi    Xi>Tr[Lk]. 

t=m+l 

On  the  other  hand,  if  we  take  Lk  to  be  the  span  of  Qm+i ,  •  •  • ,  <lm+k,  then  we 
have 

m+k 

Y    X,  =  Tr[Lk].a 

i=m+l 


3      Max  characterization  of  Ef=i  A,-  for  symmetric 
matrices 

Let  Af  be  an  arbitrary  element  of  «SK"  with  eigenvalues  Ai  >  ...  >  A„. 
Then  lemma  2.7  implies  that 

k 
Ef^i  A.  =  max  Y^  vjMvj 

where  the  max  is  taken  over  all  sets  of  k  orthonormal  vectors  {vi,...  ,Vk}- 
Now 

k  k 

Y^vJMvj     =    Y<VjvJ,M  > 

k 


Let  us  write 


s!^   =   {vesisr 


3=1 


where  {uj, . . . ,  Ujt}  runs  over  all  orthonormal  sets  of  k  vectors. 
Then 

ELiA.  =  max  <V,M>-  (3.1) 

We  now  provide  another  characterization  of  X^JLj  A,.  We  let 

S^  =  {UeS^\0<U<I,  TtU  =  k} 

and 

V^  =  {D  e  SW  I  D  is  diagonal  ,0  <  D  <  I,TrD  =  k}. 

Lemma  3.2  S^  =  {PDP'^  \D  eV^,  P  e  (T) 

Proof:  It  is  obvious  that  PDP'^  €  5^  for  any  D  e  2>2  and  P  G  C?". 
Indeed,  the  partial  ordering  <  on  K"'"*  is  preserved  by  conjugation  by  or- 
thogonal matrices,  so  that  * 

0<D</.j=>0<  PDP"^  <  I. 
8 


Also 

Tr{PDP'^)  =  Tr{DP'^P)  =  TrD  =  k. 

Conversely,  given  any  (7  6  «?* ,  a  spectral  decomposition  of  U  yields 

U  =  PDP"^ ,  P  orthogonal,  D  diagonal, 

i.e.  the  entries  of  D  are  the  eigenvalues  of  U  and  the  columns  of  P  are  the 
corresponding  eigenvectors.  Then,  as  before,  we  get 

Q<U  <I  =>Q<D  <I 

TtU  =  ik  =>  TrD  =  k 
and  we  conclude  that  Z)  G  D*-  '-' 

Lemma  3.3  X>2  '*  °  retract  of  S^  ,  i.e.  D*  C  S2  and  there  is  a  map 

r:S^  — ►  V^ 

such  that  r{D)  =  D  for  all  D  €  P* • 

Proof:  Given  U  £  S^,  define  r{U)  to  be  the  diagonal  matrix  consisting 
of  the  diagonal  entries  of  U,  i.e. 

r(U)-.  =  l  "•■'    '^'=-' 
^    ^*^      \  0       otherwise 

It  is  obvious  that  r{D)  =  D  for  D  6  V^.  We  check  that  r{U)  e  V\.  r{U) 
is  diagonal  by  construction  and  dearly  Tr{r{U))  =  Tr{U).  Moreover,  by 
lemma  2.2,  we  have 

U>0=>uii>0,  Vt 

U  <I    ^^^    I-U  >0 

=>     1  -  Uii  >  0,  Vi 
=^     u„  <  1,  Vi 

Hence  :  0  <  r(f/)„  <  1  for  all  i  =  1,. . .  ,n.  But  since  r((/)  is  diagonal,  this 
implies  that  0  <  r{U)  <  /.  □ 

Now  let  Q  €  O"  be  such  that  Q'^MQ  =  A,  where  A  =  diag  {A, , . . . ,  A„}. 


Remark  3.4  Even  with  the  order  of  the  eigenvalues  fixed  the  matrix  Q 
is  not  uniquely  determined,  in  general,  due  to  possible  eigenvalues  with 
multiplicity  greater  than  1.  Indeed,  Q  may  be  replaced  by  QP  for  any 
P  €  O"  such  that  P^AP  =  A.  But 

(AP)0=^A..P,j  =  A.P.j 

s 

{PA)i,='£PuAt,  =  XjP,, 
t 

so  that 

AP  =  PA==>  Pij  =  0  for  all  ij  with  A,  ^  A_,. 

Hence  P  is  block  diagonal,  P  =  diag  {P\  ,...,/*„,),  where  m  is  the  number  of 
distinct  eigenvalues  of  M.  The  size  of  Pi,  for  1  <  /  <  m,  is  the  multiplicity 
of  the  corresponding  eigenvalue,  and,  of  course,  P;  is  orthogonal. 

It  is  clear  that 

Ei=i  A.  =  max   <  I>,A  >  .  (3.5) 

Since  r(52 )  =  P2'  *^  \idL\e 

max  <  D,A  >=  max  <  r(f/),A  > 

and,  as  A  is  diagonal,  <  r([/),A  >  =  <  U,A  >.  Thus 
Ef=i  Ai     =     max  <U,A> 

=     max  <  U,Q^MQ  > 

=     max  <  QUQ^.M  > 

?2' 


ueS^ 


Furthermore,  observe  that  the  map 

induces  a  bijection  of  5|  onto  itself,  with  inverse 

V  -^  Q^VQ 

10 


and  it  follows  that 

YlLi>^  =  max  <V,M  >  .  (3.6) 

Next  we  determine  which  elements  U  €  S2  realize  the  maximum 

max  <V,M  >  . 

This  requires  more  information  about  the  eigenvalues  of  M.  Let  us  assume 
that  the  A;"'-eigenvalue  has  multiplicity  t,  so  that 

Ai  >  . . .  >  A,  >  A^+i  =  . . .  =  Ait  =  . . .  =  K+t  >  K+t+i  >  ■■■>  K     (3.7) 

where  r  >  0.  It  is  clear  that  those  D  £T>^  which  realize 


max  <  Z?,A  > 

1* 


deP.* 


are  of  the  form  D  —  diag{Di ,  Z?2,  ^3)  where  the  diagonal  matrices  Di,D2,  D3 
satisfy 

Di  =  It,  the  identity  r  x  t  matrix 

Z?3  =  0,  the  zero  {n  —  r  —  i)x{n  —  r  -t)  matrix 

D2  €  5»',  with  0  <  Z)2  <  h,  TtD2  =  k-r. 

Let  us  write  S2{t)  for  the  set  of  all  symmetric  t  x  t  matrices  M  of  trax;e  p 
with  0<  M  <It. 

Lemma  3.8   The  matrices  U   E  S2  for  which  r{U)  =  diag{D\,D2,Dz), 
D\,D2,Dz  as  above,  are  precisely  those  matrices  of  the  form 


It 

0    0  ■ 

0 

U2     0 

0 

0     0 

u  = 

where  U2  G  S2~^{t). 

Proof:  Clearly  any  such  U  €  S2  satisfies 

r{U)  =  diagiI,,D2,0) 
with  D2  =  '■([/j)  of  the  appropriate  type. 
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Now  suppose  that  U  £  S2  '\s  such  that  r{U)  =  diag{Ir,D2,0),  and  let  us 
write 

Un     U12     f/i3 

U=       U21     U22     Ui3 

U3\      U32      U33 

Note  that  if  a  matrix  V  €  3?"'*"  is  >  0,  then  so  is  any  principal  subraatrix 
V  of  V.  Indeed,  say  V  €  3?"""",  then  we  have  to  show  that  x'^Vx  >  0  for 
all  i  G  §?"*.  But  we  can  extend  z  to  a  vector  i  €  3?"  by  inserting  O's  in  the 
components  corresponding  to  the  rows  of  V  not  in  V,  and  we  get 

x'^Vx  =  x'^Vx  >  0. 

Then,  0  <  Z7  <  /  immediately  implies  that  0  <  U22  <  ^t-  Obviously, 
TrU22  =  k  —  r.  It  only  remains  to  show  that  Uu  =  Ir  and  all  other  blocks 
are  0.  But  this  is  an  easy  consequence  of  lemma  2.2.  Indeed  since  i7  >  0 
and  Ujj  =  0  if  J  >  r  +  f,  we  conclude  that 

Uij  =  0  =  Uji  foT  j  >  r  +  t,  i  <  j. 

Thus  t/31,  i732,  U33,  f/23,  Ui3  are  all  0.  Moreover,  since  V  =  I  -  U  >  0, 

Vii  =  0  for  1  <  X  <  r  and  u.j  =  «,_,  for  all  i  7^  j, 

we  conclude  that  Uji  =  Uij  =  Vij  =  0  for  1  <  i  <  r,  i  <  j.  It  follows  that 
Un  =  Ir  and  t^i2,f^i3,f^2i,  t^3i  are  all  0.  □ 

Now  recall  our  choice  of  Q  €  C?"  such  that  Q^MQ  =  A  and  write 

Q  =  [Qi:Q2:  Q3] 
where  Qi  €  O^'' ,  Q2  €  O"''  and  Q3  €  e?"-"""-'.  Thus 

QfMQi  =diag{Xi,.,.,Xr} 

QIMQ2  =  Xklt. 
By  lemma  3.8,  the  matrices  (/  G  .S*  realizing  max      ^^k  <  V,  A  >  are  of  the 

form 

"  /r       0      0 


U  = 


0     t^2     0 
0      0      0 
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with  U2£S^',0<U2<  It  and  TrU2  =  k  -  r.  Then 

QUQ''  =  Q,Qj  +  Q2U2Ql. 

Since  <  t/,A  >  =  <  U,Q'^MQ  >=<  QUQ'^,M  >,  we  have  proved  the  fol- 
lowing result. 

Theorem  3.9   The  matrices  U  £  S^  which  realize 

i:f=i  A,  =  max  <V,M  > 

are  of  the  form 

U  =  QiQl  +  Q2U2QI 

where  Q  =  [Qi  :  Q2  ■  Q3]  €  C?"  satisfies  Q'^MQ  =  A,  and  U2  €  .S»'  with 
0<U2<It  and  TrU2  =  k-r. 

We  have  obtained  three  characterizations  of  J2i=i  •^«so  far,  namely  equa- 
tions (3.1),  (3.5)  and  (3.6).  We  now  introduce  a  set  Pf  whose  relation  to 
Si  is  as  that  of  P*  ^^  «^2  •  Define 

Pf  =  {D  e  5§?"  I  D  is  diagonal,  TrD  =  k  and  da  =  0  or  1,  for  all  i  }. 

Lemma  3.10  Pj  is  the  convex  hull  of'D\. 

Proof:  Observe  that  the  number  of  nonzero  entries  of  a  matrix  Z>  €  Pf 
is  exactly  k  and  that  moreover  they  are  all  equal  to  1.  It  is  obvious  that  a 
convex  combination  of  elements  of  Pf  lies  in  P2.  So  we  need  to  show  that 
every  P  6  P2  is  a  convex  combination  of  elements  of  Pf . 
We  proceed  by  induction  on  the  number  1{D)  of  nonzero  entries  of  i?  e  P2  • 
Clccirly,  the  trace  condition  implies  that  1{D)  >  A:,  for  all  £)  6  Pj,  and  that 

i{D)  =  k^^^  D  ev\cv^. 

Obviously,  1{D)  <  n  for  all  D  e  P2. 

Now  suppose  we  have  proved  that  any  D  £  V^  with  1{D)  <  m,  (k  <  m  <  n), 
is  a  convex  combination  of  elements  of  Pf .  We  prove  that  the  same  holds  for 
D  6  P2  with  1{D)  =  m.  Let  /3  denote  the  smallest  nonzero  entry  of  D,  say 
/?  =  da  for  some  \  <  i  <  n  (Note  that  i  need  not  be  uniquely  determined). 
Define  ^  €  PJ'  be  such  that 

(t)      dii  =  1 

(n)    djj  =  1     ^     djj  >  0 
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(Again  there  may  be  many  choices  for  such  a  D).  Then  let 

D'  =  D-  fiD. 
Qearly  D'  is  diagonal  and  0  <  D'  <  I.  Moreover, 

Tr{D')  =  TrD  -  f3TrD  =  (1  -  /?)fc 
and,  since  (1  -  /?)  7^  0,  we  may  define 

Observe  that  I)  €  X'2  ^^^  ^^^^  '(^)  <  m  -  1  <  m,  so  that  by  induction,  we 
have 

t  a 

D  =  Y1  "j-^i'  *'*^  0  <  Qj  <  1,  51  "j  =  1  and  I>j  e  I^f . 


Thus  we  get 


D    =    D'  +  (3D 


1=1 

s 


s+l 


r=l 


where 


_  f  (l-/9)Qr     1  <  r<  5 
^'  "  \  ^  r  =  s  +  l 

f.     _\    Dr       1<T<S 

^'~\   b      r  =  5  +  1 


Clearly,  £>,.  6  Pf ,  for  1  <  r  <  s  +  1,  and  0  <  7r  <  1  with 

E7r       =      $:(1  -/?)«,+ /? 
r=l  r=l 

r=l 

=    (l-/3)  +  /3  =  l.D 
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Lemma  3.11  Si[  =  {PDP^  |  D  6  Pf ,  P  €  C?"  }. 

Proof:  Let  us  write  C  =  {PDP^  |  £>  €  2>f ,  P  e  (T).   Given  D  €  Pf , 
let  »!,...,  ifc  denote  the  indices  of  the  nonzero  entries  of  £>,  i.e. 

dtt  =  1  •<=>  t  =  ij,  for  some  1  <  j  <  A:. 

Given  P  6  C7",  write  P  =  [pi,. . .  ,Pn],  where  p,  denote  the  columns  of  P. 
Thus 

Pi  €  S",  pfpj  =  6ij,  for  1  <  X,  j  <  n. 

Then  PD  =  [  gi , . . . ,  9t»]  where 


^       I  0      otherwis 


=  ig  for  some  s 
se 


i.e.  the  columns  pj  with  j  j^  ii , . . . ,  i^  are  replaced  by  0  while  the  others  are 
preserved.  Hence  PDP^  =  Ylt=\PitPi  •  ^^  claim  that 

Indeed,  given  PDP^  and  ii,...,ifc  as  above,  let  Q  =  [p,, ,...,p,\]  G  C7"'*. 
Then  PDP"^  =  QQ^.  Conversely,  given  Q  =  [gi,.  ..,«7fc]  €  C?"'^  we  may 
write 

for  many  choices  of  P  and  Z).  For  example,  we  may  complete  {gi,...,gfc} 
to  an  orthonormal  basis  of  Jf",  say  {gi,  • ..  ,gjt,gfc+i,.- •  ,gn}  and  let  D  = 
dtcflr  {^1,.. . ,  1, 0, . . . , 0}  and  P  =  [gi , . . . , g„].  Thus  we  have 
k 


Sl^     =     {VeS^" 


k 
V  =  22  '^•^i  >  {'^i » •  •  • » ^Jt}  orthonormal  set  } 
1=1 

=    {y  €5K"|f  =  PDP^,  i?e2>f,  Pe  c?"}.n 

Lemma  3.12  S2  is  the  convex  hull  of  S^. 
Proof:  We  have 

^2^  =  {U  €  53?"  I  U  =  PDP'^,  D€V^,  PeO"} 

sl^  =  {V  e  53J"  I F  =  PDP^,  DeV^,  PeO"" }. 

The  result  follows  since  X>2  is  the  convex  hull  of  Pf  by  lemma  3.10.  n 
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Remark  3.13  Observe  that  the  elements  of  X>f  and  S^  have  rank  exactly 
k  while  those  of  V^  and  ^2  have  rsink  >  k. 

Remark  3.14  We  proved  earlier  the  existence  of  a  retraction 


r:S^ 


V'' 


However,  the  restriction  of  r  to  <Sf  does  not  yield  a  retraction  of  S^  onto 
X>f ,  but  maps  S^  into  Pj-  ^^  ™^y  summarize  the  relationships  between 
the  spaces  Pf ,  V2,  S^  and  S2  in  the  following  diagram. 


jjjt    convex  hull^  —^ 


convex  hull      ^ 


Given  a  symmetric  matrix  M  with  spectral  decomposition  Q^MQ  =  A,  we 
trivially  have  the  foUowing  characterization  of  J2,i=i  ^i- 


X)Li  -^t  =  ™a^  <  D,A> 


beVl 


Taking  the  convex  hull  of  V'l  gives 


max  <D,A>=  max  <D,A> 


Then 

max  <  Z),  A  >  =  max  <  t/,  A  >  , 

D€P^  l/€iS^ 

since  X>2  is  a  retract  of  ^2  and  since,  as  A  is  diagonal, 


<  t/,A>  =  <r({/),A> 
for  any  U  €  82-  Now,  since  5*  is  the  convex  hull  of  S^,  we  conclude  that 
max  <  V,  A  >=  max  <  t/,  A  >  . 

V€5i*  l/€52* 
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Finally,  since  S^   and  S2  are  invariant  with  respect  to  conjugation  by  or- 
tbonormal  matrices,  we  conclude  that 

Jli=i  ^  =  ™^  <V,M>=  max  <  U,M  >  . 
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4     The  generalized  gradient  of  E-=i  A,  :  53?"  — >  dt 

We  now  consider  XT^^i  A,  as  a  function  s^  defined  on  «SR", 

For  M  e  S^"  and  F  €  3?"''",  let  us  write 

Cv{M)=<V,M  >  . 
Then  £v  is  a  linear  functional  on  53?"  and  by  (3.6)  we  have 


^ki^f)  =  max  CuiM). 


ueS^ 


We  obtain  the  following  characterization  of  the  generalized  gradient  ds^  of 

3k. 

Theorem  4.1   For  M  e  SW  we  get 

dsk{M)  =  {[/  €  5»"  I  [/  =  QxQl  +  Q2U2QI  } 

where  the  eigenvalues  of  M  satisfies  (3.7), and  Q  =  [Qi  :  Q2  :  Q3]  6  C"  and 
U2  €  «S3J'  are  as  in  theorem  3.9. 

Proof:    By  [7,  theorem  2.8.6,  p.  92]  and  theorem  3.9,  we  know  that 
dsk{M)  is  the  convex  hull  of 

{UeS^^\U  =  QxQl  +  Q2U2Ql}. 

So,  we  only  need  to  observe  that  this  set  is  already  convex.  It  is  sufficient 
to  prove  that  the  set 

A  =  {V  eSlSt'\Q<V  <It,  TrV  =  p  }, 

for  some  fixed  p,  is  convex.  But  for  Vi ,  Vj  ^  A,  s  ^  [0, 1],  it  is  clear  that 

sVx  +(1-5)V^2  e^K* 

Q  <  sVx  +  {I  -  S)V2  <It. 

Moreover, 

Tr{sVi  +  (1  -  s)V2)  =  sTrVi  +  (1  -  s)TrV2  =  p.  □ 
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Remark  4.2  Consider  the  special  case  fc  =  1.    Then  r  =  0  in  (3.7)  and 
si{M)  =  Ai(M)  for  M  £  S^".  We  get 

dXiiM)  =  {V  eS^"\V  =  QiUQl,  U  €  5»",  0  <  f/  <  It,TrU  =  1  } 

where  Qi  €  O"-',  Qj MQi  =  Ai(M)/t.  Observe  that  the  condition  U  <  It 
is  redundant,  i.e. 

U  e  S^\  U>0,  TrU  =1   =»   f/  <  /,. 

Indeed,  let  {/ii, . . .  ,/xj}  denote  the  eigenvalues  of  U .  We  have  /z,  >  0  for  all 
i  since  t/  >  0.  Then 


1=1 
=*■     Mi  <  1.  for  all  i 

So  we  recover  the  characterization  of  d\\{M)  derived  in  [16]. 
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5     The  differential  of  ^^  :  "53?"  x  Q"  — >^ 

We  may  now  apply  the  results  of  section  3  to  a  symmetric  definite  pencil 
{A,B).  So  let  G  denote  the  Choleski  factor  of  B  and  let  Q  €  Q"  be  such 
that 

where  A  =  diag  (Aj , . . . ,  A„)  and  Ax  >  . . .  >  A„  are  the  eigenvalues  of  (A,  B). 
Let  X  =  G     Q,  so  that 

f  X^^BX  =  / 

\  X'^AX  =  A 

Let  us  write  gk  for  Yli=i  ^i  viewed  as  a  function  on  «SJ?"  X  Q"  and,  for 
V  e  3?"''",  let 

9v{A,B)  =<  V,G-^AG-'^  >  . 

Since  A(A,B)  =  A(G-MG-^)  and  G-^AG"^  is  symmetric,  we  get  (see  (3.6)) 

gk{A,B)=  max  ^v{A,B)  (5.1) 

and 

gkiA,B)=       max       <U,A>.  (5.2) 

Moreover,  theorem  3.9  provides  a  description  of  those  V  ^  S2  which  realize 
the  maximum.  Equation  (5.2)  is  not  a  very  useful  representation  of  gk{A,B) 
because  the  set  over  which  the  maximum  is  taken  depends  upon  B.  On 
the  other  hand,  in  order  to  use  equation  (5.1)  for  computing  generalized 
gradients  we  need  to  be  able  to  compute  the  derivative  of  the  Choleski  factor 
G  as  a  function  of  B.  Observe  that  the  functions  *v  :  53?"  x  Q"  — >  §?, 
while  linear  in  A,  are  not  linear  in  B. 

We  now  define  maps  a,  /3,   and  7  as  follows. 

a:     Q""     — ►      T" 
B      ^^      G 

where  T^  C  3?"**"  denotes  the  subset  of  lower  triangular  matrices  with  pos- 
itive diagonal  entries.  Let  V"  denote  the  linear  space  of  all  lower  triangular 
matrices.  Then  V"  =  R"<"+i)/2  ^^  ^^  ^j^^  jn  ^  ^^  ^^^^  subset  of  V". 
Similarly  Q"  is  an  open  subset  of  53?"  ^  3jn("+i)/2 

/3:    T"     — y     V 
G      y-^     G-i 
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Observe  that  the  inverse  of  an  element  of  7^  is  indeed  an  element  of  T". 

7  :    53?"  X  T"     ^      SlSr 
{A,L)        ^^     LAL^ 

Also,  recall  the  map  Cv  defined  in  section  4, 

Cv  :    53?"     — *  3?" 

M      ^-    <V,M> 

where  V  £  S2-  Then  we  have 

9v{A,  B)  =  £v(7  ((1  X  /9)  ((1  X  a)  {A,  B)))).  (5.3) 

Note  that  SW  x  Q"  is  an  open  subset  of  5R"  x  53J",  so  that  the  tangent 
space  to  SU"  x  Q"  at  a  point  {A,B)  is 

5K"   X  53?"  ^  »"("+^)/2  X  3}n(n+l)/2_ 

Similarly,  the  tangent  space  to  T"  at  a  point  G  is 

V"  ^  3U"("+i)/2. 

We  do,  however,  write  our  matrices  as  square  matrices;  otherwise  we  should 
modify  the  definition  of  the  inner  product  <  ,  >  as  follows  :  multiply  the 
terms  corresponding  to  the  entries  lying  below  the  diagonal  by  a  factor  of  2. 
Now  for  (5,1")  e  «S3J"  x  53?",  the  chain  rule  gives  the  following  expression 
for  the  differential  D^v{A,B)  evaluated  at  {S,T). 

D^v{A,B)iS,T)    = 

DCviliil  X  /3)((1  X  a)iA,B)))){D['(  o  (1  x  ^)  o  (1  x  a)]{A,B){S,T). 
But  Cv  is  linear  so  that  we  get 

D9v{A,B){S,T)  =<  F,Z?[7  o  (1  x  /?)  o  (1  x  a)]{A,B){S,T)  >  .     (5.4) 

We  compute  the  differentials  of  the  maps  a,  /3  and  7  next. 

The  differential  of  /3.  Given  G  €  T"  and  I  6  V",  we  must  compute 
DP{G){L).  Since  (3{G)G  =  /,  for  1  <  /  <  Jb  <  n  we  have 

dki0{G)G  +  0{G)Eki  =  0 
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where  Eki  is  the  matrix  with  1  in  the  (A:,/)-entry  and  0  everywhere  else,  so 
that 

dki/3{G)  =  -G-'Ek,G-K 

Hence,  fori,  =  {lij}  €  V", 

D(i{G)iL)    =        J2     hidkifi{G) 

\<l<k<.n 

=        E     kti-G-'Ek,G-') 

l<l<k<n 

=    -G-\     Yl     hiEki)G-'' 

l<l<k<n 

=    -G-'IG-*  €  V".  (5.5) 

The  differential  of  a.  It  is  clear  that  the  map  a  :  Q"  — ►  T"  C  V",  a{B)  =  G, 
is  smooth;  it  involves  only  rational  and  square  roots  functions.  Consider  the 
map 

d:    V"     — ►    ,S»" 
L     ^     LL'^ 

The  restriction  of  d  to  T"  maps  7^  into  Q".  Indeed,  given  x  G  3?",  L'^x  -^  0 
when  x  7^  0  since  l7  is  nonsingular  for  L  £  T^.  Thus, 

x'^LL^x  =  {L'^xf{L^x)  >  0. 

Moreover,  a  is  obviously  smooth.  Hence  a  is  a  diffeomorphism  of  Q"  onto 
T"  with  inverse  a~*  =  d,  and 

Da{B)  =  [Da-\a{B))]-^ 

We  shall  not  compute  Da{B)  (while  straightforward,  it  is  complicated  and 
unnecessary  for  our  purposes);  rather  we  will  provide  an  algorithm  for  com- 
puting Da{B){M)  for  5  e  Q"  and  M  e  SW,  which  is,after  all,  what  we 
are  interested  in.  Now 

Da{B){M)  =  I  €  V" 

if  and  only  if 

Da-^{a{B)){L)  =  M  (5.6) 
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Hence  we  may  compute  Da(B){M)  by  solving  equation  5.6  for  L.  For  {k,l) 
witii  1  <  /  <  A;  <  n,  we  have 

dkia-\G)  =  dki{GG'^)  =  EkiC^  +  a  Elk. 

Hence 

Da-\G){L)    =        Y.     ^»td,ta-\G) 

l<<<»<n 

=        E     l't[EstG'^  +  GEu] 

l<t<»<n 

=     {Y,lsiEst)G^  +  G{J2lstEt,) 

=    LG'^  +  GL^  eS^"".  (5.7) 

Thus  we  must  solve  the  equation  LG^  +  GL"^  =  M  for  the  lower  triangular 
matrix  L.  This  is  accomplished  by  the  following  algorithm. 

Algorithm. 

for  I  =  l,n 

for  j  =  l,i  —  I 

2(7„ 
First  observe  that  the  algorithm  is  well  defined. 

1.  gu  T-^  0  for  all  t  =  1,  n  since  G  G  T". 

2.  all  the  entries  of  L  appearing  on  the  right  hand  sides  of  (5.8)  and  (5.9) 
have  been  computed  during  prior  loop  iterations. 

Next  note  that 

min(«,j) 

iLG^h=     Yl    ^'"S,"- 
fc=i 

To  prove  the  algorithm  correct,  we  order  the  entries  /,j  of  L  using  the  lex- 
icographic order  on  (i,i),  where  1  <  j  <  t  <  n,  and  show  by  induction 
that,  having  computed  /.'j-  for  all  {i',j')  <  (»,i),  we  indeed  obtain  /,j  by 
formultis  (5.8)  and  (5.9). 
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{i,j)  =  (1,1)  :  this  is  completely  trivial;  [LG^  -\-GL^]u  =  hi9u  +  9nhi, 

so  that 

,  mil 

2ffii 

which  is  precisely  (5.9)  in  this  case, 
general  case  :  we  have 

min(i,j)  min{i,j) 

[LG^  +  GL^]ij=     J2    ^'>'9jic+     E    S^klJk.  (*) 

k=l  k=l 

U  i  =  j,  we  get  :  2Yl*k=i  ^ikSik  =  "^m  and  formula  (5.9)  follows.  If  j  <  i, 
then  all  Ijk,  i  <  k  <  min{i,j)  =  j,  and  all  l,k,  1  <  ^  <  j  have  been 
computed  already  according  to  the  induction  hypothesis,  and  formula  (5.8) 
follows  from  (•). 

The  differential  of-y.  The  map  7  is  defined  on  the  space  SW^  x  7^.  We  will 
use  indices  {k,l)  to  refer  to  the  variables  in  the  first  component  and  indices 
{s,t)  to  refer  to  those  in  the  second  component.  In  all  cases  1  <  /  <  ^  <  n 
and  1  <  t  <  s  <  n.  Since  l{A,L)  =  LAL^  for  A  €  5J?",  L  e  T",  we  have 

dkn{A,L)    =     LEkiLT 
d,a{A,L)    =    E,tAL^  +  LAEt,. 

Thus  for  iM,N)  €  53?"  x  V",  we  have 

D-riA,  L){M,  N)  =  LML^  +  NAL^  +  LAN'^  .  (5.10) 

Now,  from  (5.5),  we  readily  obtain 

D(l  X  p){A,G){M,N)  =  {M,-G-^NG-^)  (5.11) 

where  A  €  5»",  G  e  7^,  M  €  53?",  iV  €  V"  and 

1  X  /?  :  53?"  X  T"  — ►  53?"  x  T". 
Similarly,  from  (5.7),  we  get 

D{lxa){A,B){M,N)  =  {M,L)  (5.12) 

where  A  €  53?",  B  €  Q",  A/,  A^  €  53?", 

1  X  a  :  53?"  X  Q"  — ►  53?"  x  T" 
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and  i  6  V"  solves  LCf^  +  GI^  =  N. 

Finally,  combining  (5.4),  (5.10),  (5.11)  and  (5.12),  we  obtain  the  following 

expression  for  the  differential  oi  ^v- 

D^viA,B){M,N)    =    <V,G-HIG'^ -G-^LG-UG-^ -G-^AG-'^L^G-'^  > 
=    <  V,G-\M  -  LG-^A  -  AG-'^L^]G-'^  >  (5.13) 

where  Z  €  V"  solves  the  equation  LG^  +  GL^  =  N. 
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6     The  generalized  gradient  of  E*=i  A,(x) 

We  now  assume  that  we  are  given  a  smooth  function 

X     ^     {A{x),B{x)) 
where  Q,  is  an  open  set  in  3?"*,  and  we  consider  the  composite 

X     ^    ELiA.(x). 

We  want  to  compute  the  generalized  gradient  of  fk.  We  shall  write  Aj{x), 
Bj{x)  for  the  partial  derivatives  of  A{x),  respectively  B{x),  with  respect  to 
the  variable  Xj.  By  equation  3.6,  we  know  that 


fkix)=  m^xJv{Aix),B{x))  (6.1) 

52* 


veS^ 


where 

9viA{x),B{x))  =<  V,Gix)-'Aix)Gix)-'^  >,  (6.2) 

with  G{x)  the  Choleskl  factor  of  B{x). 

Proposition  6.3    The  function  fk  :  ft  — ►  3?  is  regular  and  locally  Lipschitz. 

Proof:  Equation  6.1  gives  /fc(x)  a^  a  pointwise  maximum  over  the  com- 
pact set  iS2 ,  so  that  the  conclusion  will  follow  from  [7,  theorem  2.8.2,  p.  86] 
if  we  show  that  the  functions  'iv{A{x),B{x))  are  regular  and  locally  Lips- 
chitz of  some  rank  /  for  all  V  €  (S* .  They  certainly  are  regular  since  they 
are  smooth  functions  of  x.  Moreover,  the  differentiability  of  ^viA{x),  B{x)) 
implies  that  it  is  Lipschitz  of  rank  ly  in  a  neighborhood  of  x.  More  precisely, 
by  the  Mean  Value  Theorem, 

I  ^v{Aix'),  B{x'))-^v{A{x"\  Bix"))  I  <  II  V^viAix'"), B{x"'))  \\  \\ x'-x"  \\  . 

for  some  x'"  on  the  line  segment  from  x'  to  x" .    Now  let  J\f  he  a,  convex 
compact  neighborhood  of  x  in  fi  and  let 

/  =  max  II  V*v(^(j/),5(j/))  II 

where  the  max  is  taken  over  all  j/  €  A^  and  V  €  ^j  •  Then,  each  '9v{A{x),B{x)), 
V  €  tS* ,  is  Lipschitz  of  rank  /  in  M .  □ 
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Recall  that  by  theorem  3.9,  the  elements  V  €  S2  which  achieve  the 
maximum  in  6.1  are  precisely  those  elements  of  U  £  S^  of  the  form 

U  =  QiQj  +  Q2U2QI 

where  U2  £  ^^''CO  and  Qi  €  0„.r,  Q2  £  On,t  are  such  that 

(?  =  [Qi  :  Q2  :  Qs],  Qz  €  0„,„_._, 

diagonalizes  G(x)"M(i)G(x)"-^,  i.e. 

Q^G{x)-^A{x)G{x)-'^Q  =  A 

where  A  =  diag  {Xi , . . . ,  Xn)  and  the  eigenvaJues  satisfy  (3.7).  Let  us  write 
M{x)  for  the  set  of  such  matrices  U.  Following  the  terminology  of  [16]  we 
shall  call  the  matrices  U  £  M{x),  as  well  as  the  matrices  U2  £  ^2~'^i.^)i  dual 
matrices. 

Theorem  6.4    The  generalized  gradient  of  fk  at  x  £  Q.  is  given  by 

dfk{x)=  {v£^"'\vj  =<  U,G-^[A,{x)-L,G-'^  A{x)-A{x)G-'^ lJ]G-'^  >  } 

(6.5) 
where  G  =  G{x)  is  the  Choleski  factor  of  B{x),  Lj  =  Lj{x)  solves 

LjG^  +  GLJ  =  B,{x) 

and  U  £  M{x). 

Proof:  Again  this  follows  from  [7,  theorems  2.8.2,  p. 86  and  2.8.6,  p.  92] 
once  we  show  that  the  set  on  the  right  hand  side  of  equation  6.5  is  convex. 
Let  A  denote  this  set.  Thus,  given  q^,  1  <  i  <  a,  with  0  <  a,  <  1  and 
^  a,  =  1,  and  v'  £  A,  1  <  i  <  s,  we  must  show  that 

s 

J^a.u'  £A. 
•=i 

It  is  enough  to  check  that  J2  °'i'"j  's  of  the  required  form  for  all  j  =  1, . . . ,  m. 
Let 

Z  =  G-'^[Aj{x)  -  LjG-^A{x)  -  A{x)G-^LJ]G-'^. 

Then,  vj  =<Ui,Z  >,  1  <  j  <  m,  and 

a  3 
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Now  Ui  =  QiQj  +  Q2UiQ2  for  U^  €  S^-'{t),  and 

1=1 


t=i  «=i 


Thus,  the  result  follows  from  the  convexity  of  S2~^{t).  □ 

Now  let  fl{x\d)  denote  the  generalized  directional  derivative  of  fk  at  x 
in  the  direction  d,  [7,  p.  25].  Then 

fk{x;d)  =  max{<  v,d>\  v  £  dfk{x)} 

by  [7,  Proposition  2.1.2,  p.  27].  Since  fk  is  regular  at  i,  the  usual  one- 
sided directional  derivative  fl(x;d)  exists  and  is  equal  to  f^[x;d),  see  [7, 
Definition  2.3.4,  p.  39],  so  that 

fl{x;d)  =  Tnax{<  v,d>\  v  e  dfk{x)]. 

Using  our  characterization  of  r  €  dfk{x)  given  in  equation(6.5),  we  get 

m 

<v,d>     =     ^Vjdj 

m 

=    T,d:  <  U,G-^[Aj{x)  -  LjG-'^A{x)  -  /l(i)G-^iJ']G-^  > 

m 

=     <  U,'^djG-'^[Akix)  -  IjG-M(i)  -  A{x)G-'^ lJ]G-'^  > 

3  =  1 

with  U  G  M{x),  i.e.  U  =  QiQj  +  Q2U2QJ.  Thus 

<v,d>    =    <  Ir,Mi{d)>  +  <U2,M2id)> 
=    TrMi{d)+<U2,M2{d)>, 

where 

TTl 

Mi{d)  =  Y^djQ'[G-'^[Aj{x)  -  LjG-^A{x)  -  A{x)G-'^ lJ]G-'^Qi  €  5r 
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and 

m 

M2{d)  =  Y^  djQlG-'[A,ix)  -  LjG-'A{x)  -  A{x)G-'^LJ]G-'^Q2  €  S^. 

3  =  1 

We  obtain  the  following  result. 

Proposition  6,6   The  directional  derivative  fj^(x;d)  is  equal  to 

TrMi{d)  +  sum  of  the  {k  —  r)  largest  eigenvalues  of  M2{d). 

Proof:  We  have 

fl{x;d)     =     Tnax{<v,d>\  V  edfkix)} 

=    TrMi{d)+       max       <  t/2,M2(d)  > 

U,€S^-'{t) 

But  by  the  results  of  section  3, 

max       <  U2,M2{d)  > 
u.eSt'it) 

is  precisely  the  sum  of  the  k  —  r  largest  eigenvalues  of  M2{d).  O 
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7     The  ratio  of  the  first  two  eigenvalues  of  a  sym- 
metric definite  pencil 

We  now  consider  the  following  problem. 
Problem  :     Given  a  smooth  function 

we  want  to  find 

min  Mil  (7.1) 

where  Ai(x)  >  A2(x)  are  the  two  largest  eigenvalues  of  the  symmetric  def- 
inite pencil  h{x)  =  {A{x),B{x)).  Observe  that  the  above  problem  is  most 
interesting  when  Ai(x)  <  0.  We  will  come  back  to  this  point  below,  but 
first  we  recall  Clarke's  characterization  of  the  generalized  gradient  of  a  quo- 
tient (see  [7]).  In  the  following  discussion,  the  functions  are  assumed  to  be 
Lipschitz  near  the  point  i. 

By  [7,  Proposition  2.3.1,  p.  38],  we  know  that 

d{tf){x)  =  tdfix) 

for  any  (  €  3?.  In  particular,  d{-f){x)  =  -df{x).  We  also  have  the  chain 
rule  for  generalized  gradients,  [7,  theorem  2.3.9,  p.  42].  More  precisely,  pven 
«7  :  3?"  — ►  »  and  /i  :  3?"*  — ►  JJ",  with  h,  :  3?"*  — ►  ??,  1  <  i  <  n,  denoting 
the  components  of  h,let  f  =  g  o  h  denote  the  composite  of  g  and  h.  Then 
/  is  Lipschitz  nejir  a;  if  ^  and  /i,,  1  <  t  <  n,  are  Lipschitz  near  i,  and 


df{x)Cco\j2<^,^. 


^i  €  dhi{x),a  =  (ai,...,CT„),CT  €  dg{h{x)) 


Moreover,  equality  holds  and  /  is  regular  at  x  if  the  following  conditions 
hold. 

1.  g  is  regular  at  h{x). 

2.  hi  is  regular  at  i,  1  <  i  <  n. 

3.  for  every  a  e  dg{h{x)),  cr,  >  0  for  1  <  i  <  n. 

The  quotient  rule  is  now  a  consequence  of  the  chain  rule.  Consider  the 
map  g{u,  v)  =  u/v  defined  on  {(u,  v)  ^lk^\v  ^  Q).  This  map  is  continuously 
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differentiable,  hence  strictly  differentiable,  at  any  point.  Thus  g  and  —g  are 
both  regular  everywhere,  and  the  generalized  gradient  of  5  at  (u,v)  is  just 
the  gradient  of  g,  i.e. 

dg{u,v)  =  {-, — 2)- 
Given  a,  (3  :  S?"*  — ►  Jf,  we  let  h{x)  denote  the  maps 

hix)  =  {±aix),±0{x)), 
and  apply  the  chain  rule  to  the  composite  g  o  h.  We  get 
Lemma  7.2  For  a,^  :  SJ"*  — ►  »  with  /?(i)  7^  0, 

P{x)da{x)  -  a{x)d(i(x) 


.©(x,C 


fi\x) 
and  equality  holds  in  any  of  the  following  cases. 

1.  oc{x)  >  0,/3(i)  >  0,  Q,  —/3  are  regular  at  x 

2.  q(i)  <  0,/9(i)  >  0,  a,  /?  are  regular  at  x 

3.  q(i)  >  0,/3(i)  <  0,  -a,  -/3  are  regular  at  x 

4.  o.{x)  <  0,/3(x)  <  0,  -Q,  (3  are  regular  at  x. 

Proof:  We  prove  the  first  case;  the  others  are  entirely  similar.  Now 

||^  =  -5(a(x),-/3(x)). 
Thus  we  conclude  that 

Since  d{—fi){x)  —  —dl3{x),  the  set  on  the  right  hand  side  is  equal  to 

1 W^) ^^      da{x)^^2  e  d(i{x)j. 

Next,  we  observe  that  this  set  is  convex  since  da{x)  and  d(i{x)  are  con- 
vex ([7,  Proposition  2.1.2,p.  27]).  Finally,  we  easily  verify  that  the  equality 
conditions  are  satisfied.  We  have 

/1(I)  =  (Q(X),-/3(X)), 
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so  that  hi  is  regular  at  i,  for  i  =  1,2,  and 

clearly  has  nonnegative  components.  D 

We  now  return  to  problem  (7.1).  We  make  the  following  assumptions  on 
the  map  h. 

-A{x)  €  Q",  for  all  z  €  ft  (7.3) 

Ai(i)  is  a  simple  eigenvalue  for  all  x  e  il.  (7.4) 

Hypothesis  (7.3)  implies  that  Xi{x),  and  hence  all  the  other  eigenvalues, 
are  strictly  negative  for  all  i  6  ft.  Hypothesis  (7.4)  implies  that  Ai(x)  is  a 
smooth  function  of  i.  Thus  so  is  — Ai(i),  and  both  are  regular  at  x.  We 
point  out,  however,  that  (7.4)  is  not  a  very  restrictive  assumption.  Indeed, 
if  x'  is  a  solution  of  (7.1)  with  Ai(i*)  <  0  then  A2(i*)  <  Ai(x*)  unless  Ai 
has  multiplicity  at  least  2  near  x'. 

Consider  the  following  lemma,  whose  proof  is  completely  obvious. 

Lemma  7.5  Let  a{x),  (3{x),  x  €  ft  C  3?"*  denote  two  real  valued  functions 
such  that  /3(x)  :^  0  for  all  x  e  ^  and 

^>0,/ora//xeft. 

Then  xq  €  ft  minimizes  ||||  over  ft  if  and  only  if  xq  maximizes  |i||. 

Thus,  since  ^ijfj  >  0,  for  all  x  G  ft,  by  (7.3),  we  get  that  (7.1)  is  equivalent 
to 

find        min^T^-  (7-6) 

xen  -Ai(x)  ^     ' 

Finally,  it  is  obvious  that  (7.6)  is  equivalent  to 

.    Ai(i)  +  A2(x) 

nnd        min -^— ^. 

xeo       -Ai(x) 

So  we  have  replaced  problem  7.1  with  the  following  one. 

Problem  :  Given  the  smooth  function 

h:    a    ^      53?"  xQ", 
X     ^     (A(x),5(x)) 
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let  fkix)  =  9kWx)),  the  sum  of  the  k  largest  eigenvalues  of  h{x),  and  let 

'^  '     -M') 

Then  find 

nunp(x).  (7.7) 

Now,  for  I  6  fi,  /zCa;)  <  0,  — /i(x)  >  0,  /2  is  regular  at  x  by  Proposi- 
tion 6.3  and  — /i  is  regular  at  x  by  7.4.  Hence  we  may  apply  lemma  7.2, 
case  2,  to  the  function  p{x)  and  we  have 

M.)=/'(-W-W-{'(-W'W.  (7.8) 

We  may  apply  the  results  of  section  6  to  the  computation  of  the  general- 
ized gradient  of  p(i).  So  let  G  =  G{x)  be  the  Choleski  factor  of  B{x)  and  let 
Q  be  an  orthonormal  matrix  which  diagonalizes  G{x)~^  A{x)G{x)~^ .  Let  t 
be  the  multiplicity  of  A2(a:),  and  let 

Q  =  [Qi  :  Q2  :  Q3]  (7.9) 

where  Qi  =  [qi]  e  C»"'\  Q2  G  C?"''  and  Q3  e  C?"'"-'-^  For  I  <  j  <  m,  let 
Aj{x)  and  Bj{x)  denote  the  partial  derivatives  of  i4(i)  and  B{x)  respectively, 
and  let  Lj  =  Lj(x)  be  the  lower  tricingular  matrix  which  solves  the  equation 

LjG^  +  GLj  =  B,{x). 

Write  Zj  for  the  matrix 

G-^  [Aj{x)  -  LjG-^A{x)  -  A{x)G-'^LJ]  G"^.  (7.10) 

Then,  theorem  (6.4)  ^ves 

5/1(1)  =  {we^"'\wj=<  q,qJ,Z,  >,  1  <  ;•  <  m  }  , 

dh{x)  =  {veW^\vi=<  qiql,  Zj>  +  <  Q2U2QI,  Z,>,l<j<m), 

where  U2  runs  over  all  matrices  in  SUt),  namely  those  symmetric  t  x  t 
matrices  U2  with  0  <  U2  <  It  and  TrU2  =  1.  The  following  theorem  follows 
from  equation  (7.8). 
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Theorem  7.11    The  generalized  gradient  of  p  is  given  by 

for  I  <  i  <  m 


=  (" 


ap(i)  =   <  t>  e  s 


AfW ■2^1  <'■''' 


We  also  obtain  the  following  characterization  of  the  directional  deriva- 
tives of  p{x).  For  de'St"',  let 

m 

m(d)    =    l]<ijgfZjgie»,  (7.13) 

i=i 

m 

M(d)    =    J^d,Q'^Z,Q2eS^K  (7.14) 

Proposition  7.15  For  d  E  3?"*,  u;e  /laue 

p'(i;d)  =  ^        m{d)  -  -—-—  x  (the  leirgest  eigenvalue  of  M{d)  ). 
Af(x)  Ai(i) 

Proof:  As  in  Proposition  6.6,  the  directional  derivative  of  p  is  given  by 

p'{x; d)  =  max{<  v,d>\  v  e  dp(x) }. 

But 

m 

<v,d>     =    J^vjdj 

=    ^  "■<")- AIM  <^-'^('">- 
Now,  since  -^^  >  0, 

^'^'^'''^  =  ^  "'^''^  "  A;{^  """^  {<  ^2,M(<i)  >  I  C/2  G  ^j'-^O}  , 
and  the  result  follows  from  section  3.   □ 
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8     The  gap  between  the  first  two  eigenvalues  of  a 
symmetric  definite  pencil 

The  discussion  of  the  previous  section  on  the  ratio  of  the  first  two  eigenvalues 
of  the  symmetric  definite  pencil  h{x)  =  {A{x),B{x)),  where 

applies  almost  verbatim  to  the  study  of  the  gap  between  these  eigenvalues. 
More  precisely,  if  Ai(i)  >  A2(i)  are  the  two  largest  eigenvalues  of  h(x), 
we  wish  to  maximize  the  gap  between  X\{x)  and  A2(x),  or  equivalently,  to 
minimize  the  difference 

/2(x)-2/i(x)  =  A2(x)-A,(z). 

We  assume  as  before  that  hypotheses  (7.3)  and  (7.4)  hold  so  that  Ai(i)  <  0 
is  simple,  and  hence  smooth,  for  all  x.  We  may  then  apply  Qarke's  result 
on  the  generalized  gradient  of  a  sum,  ([7,  Corollary  3,  p.  40]),  and  our 
computations  of  the  generalized  gradient  of  /fc(i)  to  obtain  the  following 
theorem. 

Theorem  8.1  Let  r;(i)  =  /2(x)  -  2/i(x).  Then  the  generalized  gradient  of 
T]  is  given  by 

dr,{x)     =     {ve^"'\v,  =  {Q2U2Q'^-qiq[,Z,),  forl<j<m}{8.2) 

where  Ui,  Q2  and  Zj,  \  <  j  <  m,  are  as  in  theorem  (7-11)- 

Similarly,  we  obtain  the  characterization  of  the  directional  derivative  of 
r]{x). 

Proposition  8.3  For  d  G  ^J"*,  we  have 

rj'{x;d)  =  (  the  largest  eigenvalue  of  M{d)  )  -  m(<f), 

where 

M{d)  =  Y,d,QlZ,Q2. 
3=1 

m 
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9     Optimality  Conditions 

We  now  describe  optimality  conditions  for  the  constrained  minimization  of 
the  functions  p{x)  ajid  r]{x)  defined  in  the  two  previous  sections. 
Thus  we  let 

C  =  [ci...c^]     €     J?"^'^'" 
b  6     ST' 


and  we  consider  the  following  two  problems 

(PI) 


min  p{x) 

subject  to 

=    b 
X      <   u 


j  Cx 


and 

(P2) 


min  T](x) 

X 

subject  to 

=    b 
X      <    u 


/  Cx 


We  assume  that  the  hypotheses  (7.3)  and  (7.4)  hold. 

We  now  apply  the  Lagrange  multiplier  rule,  see  [7,  Theorem  6.1.1,  p. 
228],  to  derive  optimality  conditions  for  problems  (PI)  and  (P2). 

Theorem  9.1   Let  x  G  J?""  be  a  feasible  point  for  problems  (PI)  and  (P2). 
Then 

1.  a  necessary  condition  for  x  to  solve  (PI)  is  that  there  exist  a  dual 
matrix  U-2  €  .S3?',  where  t  is  the  multiplicity  of  A2  at  x,  and  vectors  of 
Lagrange  multipliers  a  €  K"' ,  and  7  €  S?"*  satisfying 

for  l<j<m  (9.2) 

Tr(U2)     =     1  (9.3) 

0  <   U2     <     I  (9-4) 
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and 


7,=0 

if     Ij  <  Xj  <  Uj 

fj>0 

if            X,  =  I, 

7j<0 

if             Xj  =  Uj 

(9.5) 


2.  similarly  a  necessary  condition  for  x  to  be  a  solution  of  (P2)  is  that 
there  exist  U2,  en,  and  7  as  above  satisfying  all  of  the  above  equations 
but  with  (9.2)  replaced  by 

{Ui^QlZjQi)  -  <  qiq[,Zj  >     =     <  a,Cj  >  +7^, 

fori  <j  <m     (9.6) 

Here  the  matrices  U2  and  Zj  are  as  in  theorems  (7.11)  and  (8.1). 

We  devote  the  remainder  of  this  section  to  showing  how  to  derive  a 
descent  direction  for  the  functions  p(a;)  and  r]{x)  in  the  case  that  all  the 
optimality  conditions  are  satisfied  by  U2,  a,  and  7  except  possibly  for  con- 
dition (9.4). 

Theorem  9.7  Suppose  that  x,  U2,  a,  f  satisfy  conditions  (9.2),  (9.3)  and 
(9.5).  Let  0  be  an  eigenvalue  of  U2  with  normalized  eigenvector  v,  and  let 
s  €  R.  If  d  ^  3?"* ,  6  £  ^  solve  the  following  system  of  equations 


^djQlU2Q2-6I     =     svv'^ 


(9.8) 


(9.9) 
(9.10) 


Cd     =     0 

dj  =  0      :/     Xj  =  Ij  or  Xj  =  Uj 

then  d  is  a  feasible  direction  for  (PI)  and  (P2)  with  directional  derivatives 
1. 


p'{x;d)=  < 


V'{x;d)  = 


-s{d  -  1)     if    s>0 
-s9  if    s<0 
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Proof:  Recall  the  definitions  of  the  matrix  M{d)  =  Xl^i  d^Qj ZjQ2  and 
the  number  m{d)  =  E^zzi  djqf  Zjqi  (see  (7.14)  and  (7.13)).  Equation  (9.8) 
gives 

M{d)  =  6I  +  svv'^, 

so  that  all  the  eigenvalues  of  M{d)  are  equal  to  6  except  for  one  which  equals 
6  +  s.  Let  I  =  max {6,6  ^s}.  Now,  taking  the  inner  product  of  equation  (9.8) 
with  the  matrix  f/j  yields 


or,  equivalently, 


Y,  d,  {U2,Q^Z,Q2)  -  6Tr{U2)  =  s9. 


(9.11) 


j=i 


Recall  from  Proposition  7.15  that 


K  the  optimality  conditions  (9.2)  and  (9.3)  are  satisfied,  equation  (9.11) 
becomes 


i.e. 


At  I  a;  I  V  ' 


Ai(x) 

But  Cd  =  0  by  assumption,  and  moreover,  if  optimality  condition  (9.5)  is 
satisfied,  then  7^d  =  0  as  well.  Hence 


Mil 

Ax(x) 


7n{d)  —  6  =  s0, 


and  we  conclude  that 


Ai(x) 


A2(x) 

r4?^  if  ^>o 


jfk      if    ^<0- 
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We  consider  the  function  t](x)  next.  Recall  from  Proposition  8.3  that 

r]'{x;d)  =  l-Tn{d). 
Equation  (9.11)  becomes 

m 
'^dj[q[Zjqi-\-  <  a,Cj  >  +7^]  -  6  =  sO, 

or  equivalently 

m{d)  =  a'^Cd  +  7^d  -6  =  3$, 

and  we  get 

m{d)  -6  =  sO. 

We  conclude  that 


'(     A\      I  -^(^-1)    if    s> 


It  is  now  easy  to  generate  a  descent  direction  for  p{x)  or  j/(x)  in  the  case 
the  matrix  U2  does  not  satisfy 

0  <  1/2  <  /. 

Indeed,  U2  must  have  an  eigenvalue  6  falling  outside  the  interval  [0,1].  If 
^  <  0,  we  simply  choose  s  to  be  negative,  say  5  =  -1,  while  if  ^  >  1,  we 
choose  s  to  be  positive,  e.g.  s  =  I, 
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10      Perturbations  of  a  disk 

We  now  describe  our  families  of  perturbations  of  a  disk.  Consider  a  disk 
D  centered  at  the  origin  (for  simplicity),  and  of  fixed  area.  We  write  D  as 
the  union  of  a  disk  Do  of  radius  Rq  and  an  annulus  Di  of  width  Ri.  We 
divide  Di  into  sectors  of  angle  6  —  27r/M  and  we  write  Z?i  as  a  union  of 
isoparametric  quadrilateral  elements  with  8  nodes  as  in  the  following  picture. 


where  one  rectangular  element  is  shaded  and  the  nodes  are  displayed.  The 

disk  Dq  is  triangulated  with  isoparametric  triangle  of  type  2  [5].  Now  the 
nodes  of  the  rectangular  elements  are  all  of  the  form 

(iZo  A — iZi)(cosTn^,sinm^) 
n 

where  n  is  an  even  integer  equal  to  twice  the  number  of  "layers"  of  rectangles 
in  D\  (n  =  4,  M  =  16  in  the  above  picture),  k  is  an  integer  between  0  and 
n  specifying  a  node  along  a  single  ray,  0  is  the  angle  of  a  single  sector  and 
m  is  an  integer  between  0  and  M  -  \  specifying  a  single  sector. 

We  perturb  the  disk  D  by  perturbing  the  annulus  D\  while  leaving  the 
disk  Dq  fixed  as  follows. 

for  each  ray  we  introduce  a  parameter  Xm  and  we  perturb  the 
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nodes  by 


{Rq  +  Xm  —  R\){cosTnff,sinTn9y, 


n 


the  parameter  Xm  is  taken  to  lie  in  a  small  neighborhood  of  1  (not 
containing  0!). 

Thus,  writing  i  =  (xq,  . . .  ,xjvf-i),  we  obtain  a  family  il(x)  of  deforma- 
tions of  our  original  disk  D,  parametrized  by  x  lying  in  a  neighborhood  of 
(1,...,1)  in  K"*.  We  need  to  derive  sufficient  conditions  on  the  values  of 
Rq,  Ri,  the  angle  0,  and  the  parameters  i,,  to  insure  the  nondegeneracy 
of  the  rectangular  elements.  To  this  end,  we  describe  in  some  details  the 
construction  of  these  elements. 

Consider  the  standard  restricted  biquadratic  element  K  (see  [5]) 

s 


1 

02 

as 

ai 

-1 

06 

as 

1 

03 

a? 

04 

-1 

equipped  with  the  subspax;e  of  Q2  spanned  by  the  following  basis  duaJ  to  the 
nodes  ai,...,a8.  (Recall  that  Q2  is  the  space  of  polynomials  in  r,s  spanned 
by  the  monomials  r"*"*  with  0  <  n,m  <  2  [5]). 

ai  =  (l,l):Ai     =     i[(l  +  r)(l  +  .)-(l-r2)(l  +  5)-(l  +  r)(l-52)] 

a2  =  (-l,l):A2     =     ^[(l-r)(l  +  .)-(l-r2)(l  +  5)-(l-r)(l-.2)] 

a3  =  (-l,-l)  :   A3     =     i[(l-r)(l-5)-(l-r2)(l-.)-(l-r)(l-52)] 
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a4  =  (l,-l):A4  =  i[(l  +  r)(l-5)-(l-r2)(l-.)-(l  +  r)(l-52)] 

as  =  (0,1)  :  As  =  ^(1  -  r^Xl  +  5) 

ae  =  i-l,0)  :  Xe  =  ^(1  -  r){l  -  s') 

a7  =  (0,-l)  :  A7  =  \{l-r'){l-s) 

as  =  (1,0)  :  As  =  ^{l  +  r)il  -  s') 

If  we  let  09  denote  (0,0),  then  {Ai,. ..,  Ag}  spans  the  subspace  Q2  of  Q2 
consisting  of  those  p(r,  s)  €  Q2  such  that 

X(P)  =  0 

where 

4  8 

X(p)  H  4p{ag)  +  YlPi''i)  -  2l^p(a.). 

t=l  .=5 

Then  the  corresponding  isopaxametric  elements  K  are  constructed  as 
follows.  Given  8  points  6^, ...  ,6*  in  §?^,  we  define  the  map 

1 

and  we  set  K  =  F{K).   The  element  A'  is  nondegenerate  if  the  map  F  is 
invertible.  Let  us  write  F  =  (^1,^2)  where 

/;(r,5)  =  X;A,(r,5)&J,  i  =  l,2. 
i=i 

Then  the  rows  of  the  jacobian  matrix  JF  of  F  are  the  gradients  of  Fi,  F2, 

8 

Vi^=53VA,6f,  i=l,2. 

We  compute  the  grcidients  VAj  for  1  <  ;  <  8. 

VAi     =     Q(l  +  3)(2r  +  5),;^(l  +  r)(25  +  r)^ 
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VA2 

VA; 
VA4 

vx. 


,2  =  Q(i  +  5)(2r-5),^(l-r)(2.-r)) 

.3  =  Q(1-3)(2r  +  5),i(l-r)(25  +  r)) 

^  =  Q(i-,)(2r-.),i(l  +  r)(2.-r)) 

^5  =  (-r(l  +  .),^(l-r2)) 


V      '  "2'  '} 

VAe     =     (-i(l-.2),-,(i-r)) 

VA7     =     (-r(l-5),-^(l-r^)) 
VAs     =     Q(l-52),-5(l  +  r)) 

Now  let  K  denote  one  of  the  elements  of  the  ring  D\.    The  nodes  fr', 
1  <  J  <  8,  of  the  perturbed  K  are  of  the  form 

h^     =     (i2o  +  a(^-^)i2i)(cosei,sinei) 

6®     =    (iio  +  a(-)  ^i)(cos0i,sinei) 

64     =    (i?o  +  a(^^^j  iZi)(cosdi,sin«i) 

6'^     =    (i?o  +  /3('^-ti-)iZi)(cosd2,sind2) 
6«     =    (i2o  +  /3(^)i?i)(cos^2,sind2) 
63     =     (72o  +  /3(^^)i2i)(cos^2,sin«2) 

,s     =     ,^,,(i±i);,,Kco.(^).in(^), 

,.     =     ,^,,(i^)«,Kcos(^),si„(^)) 

where  a,/3,7  are  the  deformation  parameters,  fc  is  an  odd  integer,  0  <  A:  <  n. 
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The  jacobian  determinant  det  JF  of  the  map  F  is  given  by 

where  (Aj)^,  (Aj),  denote  the  partial  derivatives  of  Xj  with  respect  to  r,s 
respectively.  Let  <f>  =  (02  -  6\  )/2.  We  have  the  following  theorem  whose 
proof  is  given  in  the  appendix. 

Theorem  10.1  The  jacobian  determinant  detJF  of  the  map  F  is  nonzero 
for  all  Ro  >  0,  Ri  >  0  and  Q,P,-y,4>  such  that 


I    Q-  1    I       <       € 

1/3-11     <    e 
I   |7-1  I     <    f 


where 

and  3/5  <  cos</>  <  1. 
Remark  10.2 


1  1  —  cos  <f) 
~  2  l  +  cos4> 


1.  The  condition  on  the  angle  <l>  gives 

0  <  <^  <    (approximately)  53°. 

2.  The  corresponding  range  of  values  for  e  are  : 

€  =     0.2      for  cos  <f>  =  3/5 
€  =     0        for  cos  4>  =  I. 

The  above  theorem  shows  that  all  the  quadrilateral  elements  of  the  ring 
Di  are  nondegenerate  if  M  is  chosen  larger  than  8,  and  if  the  parameters 
I,,  0  <  M  —  1,  satisfy  |  i,-  —  1  |<  f,  where 

^  1  1  -  cos  ^ 
'  ~  2  1  +  cos  ^  ■ 
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11      Description  of  the  Algorithm 

We  present  the  algorithm  we  used  to  minimize  the  function  the  ratio  A1/A2 
of  the  two  smallest  eigenvalues  of  the  Laplacian  as  a  function  of  the  shape  of 
the  domain  Q.  This  is  a  suitably  modified  version  of  an  algorithm  developed 
by  M.  Overton  [16].  We  briefly  describe  the  changes  that  need  to  be  made 
and  refer  the  reader  to  [16]  for  a  detailed  account. 

First  of  all,  we  consider  the  family  ^(x),  x  G  5J"*,  of  deformations  of  a 
disk  constructed  in  section  10.  The  regons  il{x)  are  equipped  with  a  trian- 
gulation  made  of  isoparametric  rectangles  and  triangles  of  type  2  which  is 
uniform  in  i.  By  this  we  mean  that  the  number  of  elements  as  well  as  their 
relative  position  with  respect  to  one  another  is  independent  of  the  choice 
of  X.  Then,  standard  finite  element  techniques  allow  us  to  transform  the 
problem  of  computing  the  eigenvalues  of  the  Laplacian  on  fi(i)  into  the  fol- 
lowing finite  dimensional  generalized  eigenvalue  problem  [5,  6,  19].  Compute 
the  eigenvalues  of  the  symmetric  definite  pencil  {A{x),B{x)),  where  A{x) 
and  B{x)  are  the  stiffness  and  mass  matrix  ,  respectively,  of  the  Laplacian 
associated  to  the  given  tri angulation.  (A{x),B{x))  is  a  smooth  function  of 
X,  thanks  to  the  independence  of  the  triangulation  upon  the  parameters. 

Thus,  following  our  discussion  in  section  7,  we  attempt  to  minimize  the 

function 

Xi{x)  +  X2{x) 

Pi^)  = rr-T 

Ai(x) 

where  A2(x)  <  Ai(i)  <  0  are  the  two  largest  eigenvalues  of  the  pencil 
{—A{x),B{x)).  The  basic  strategy  is  to  generate  a  sequence  of  iterates 
x"  converging  to  a  (local)  minimizer  of  p(x).  Hence,  starting  with  an  initial 
guess  1°  near  (1,...,1),  so  that  the  corresponding  region  fi(i)  is  near  the 
disk,  the  algorithm  ought  to  produce  a  sequence  x"  converging  to  (1, . . . ,  1). 
Now  the  point  x*"*"^  is  obtained  as  x"  +  <f  and  the  primary  task  of  the  al- 
gorithm is  to  generate  the  step  d.  The  scheme  implemented  by  Overton  is 
to  get  d  by  partially  solving  a  linear  program  whose  constraints  are  derived 
from  the  optimality  conditions  for  a  minimizer  of  the  function  to  be  opti- 
mized (see  [16]  for  an  explanation  of  partial  solution  of  the  linear  program). 
In  our  particular  situation,  the  linear  program  that  needs  to  be  solved 
has  the  following  form.  It  depends  upon  the  multiplicity  t'  of  A2  at  the 
solution.  We  compute  an  estimate  t  of  t'  based  on  our  knowledge  of  the 
current  iterate  x"  and  in  terms  of  a  tolerance  r  as  follows. 

A2(x'')-A,+,(x'')     <     rmax{l,|A2(x*')|} 
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AjCx")  -  VaCx")     >     Tmax{l,\X2{x'')\}. 
We  also  normalize  the  problem  of  finding 

minp(x) 

by  requiring  that  the  area  of  the  re^on  Q{x)  remains  constant.  Thus  we 
introduce  a  linear  constraint  on  d  of  the  form  (Fd  =  0,  where  the  vector  c 
is  the  gradient  of  the  area  of  ^{x")  with  respect  to  x.  Also  let  qi  =  9,(1"), 
1  <  »  ^  "»  fonn  an  orthonormal  basis  of  eigenvectors  for  (A{x''),B{x'')), 
and  let  Zj  =  Zj{x^)  be  as  in  (7.10).  We  write  Q2  =  [92  •••9i+t]»  •^i  =  -^iC^)? 
A2  =  ^2(1),  and  Aj  =  Aj{x^)  for  the  ^'''-partial  derivative  of  A{x)  at  x". 
The  linear  program  is  then  given  by 


{LP") 


min  -r-  >    djqi  Z,gi  -  -^r- 

<i'S       X\j^-y      ^     *        ^  Al 


subject  to 

m 

^/-  ^(/,Q[Z,Q2  =  diag{QX  -  '^i.-.-X^t  -  A2)  (11-1) 
i=i 

^-E'^^^i^^J^i  <Ai-A2  (11.2) 

i=i 

m 

^  -  Zl  djqlZjqk  >  Afc  -  A2,  2  +  <  <  fc  <  n  (11.3) 

c^<f  =  0  (11.4) 

l<d<u  (11.5) 

||d||oo  <  r  (11.6) 

where  r  is  a  trust  region  radius  updated  by  the  algorithm,  and  /,  u  are 
determined  by  the  neighborhood  of  (1,...,1)  in  3?"*  over  which  p{x)  is  to 
be  optimized.  The  set  of  constraints  (11.1)  arise  from  a  linearization  of  a 
suitable  system  of  nonlinear  equations  characterizing  the  conditions 

A2(i)  =  ...  =  Ai+t(x)  =  u\ 

constraints  (11.2)  and  (11.3)  ensure  that,  up  to  first  order,  the  values  of  Ai 
and  A2+<,...,An  at  the  new  iterate  are  still  greater,  respectively  less,  than 
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that  of  A2.  Note  that  the  constraints  (11.1)  form  a  system  of  t(<  +  l)/2  linear 
equations;  each  equation  has  a  Lagrange  multiplier  associated  to  it.  These 
Lagrange  multipliers  are  assembled  into  a  symmetric  matrix  U  with  diagonal 
entries  corresponding  to  the  multipliers  of  the  diagonal  equations  of  (11.1) 
and  the  off-diagonal  elements  of  U  corresponding  to  half  the  multipliers  of 
the  corresponding  off-diagonal  equations  of  (11.1)-  (The  factor  1/2  comes 
from  our  definition  of  the  inner  product  <,>).  The  symmetric  matrix  U 
provides  an  estimate  for  the  dual  matrix  U2  described  in  section  9.  More 
precisely,  we  have 

U2  =  -X,{x'')U. 
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12     Numerical  Results 

We  implemented  the  algorithm  described  above  in  Fortran  77.  The  pro- 
gram includes  a  subroutine  written  by  Overton  to  partially  solve  the  linear 
program  presented  in  the  previous  section.  The  first  few  eigenvalues  and 
eigenvectors  of  the  pencil  {A{x),  B{x))  are  computed  via  subspace  iteration 
as  described  in  [4],  and  subroutines  from  the  Unpack  [10]  and  Eispack  [12]  li- 
braries are  used  to  solve  the  resulting  linear  systems  and  reduced  generalized 
eigenvalue  problems.  The  bulk  of  the  code  that  was  written  specifically  for 
this  problem  is  devoted  to  implementing  the  parametrization  of  the  shape 
n  as  a  function  of  a  vector  x  of  parameters  as  described  in  section  10  and 
the  computation  of  the  matrices  A(x)  and  B{x)  as  well  as  their  partial 
derivatives  with  respect  to  the  parameters  i,. 

In  order  to  test  the  quality  of  our  discrete  model  of  the  geometry  we 
used  our  code  to  verify  numerically  the  celebrated  Faber-Krahn  inequality, 
namely  that  among  all  regions  of  the  plane  of  a  given  area  the  disk  is  the 
shape  with  the  smallest  first  eigenvalue  for  the  Laplacian.  Of  course,  this 
problem  is  much  simpler  than  the  Payne,  Polya,  Weinberger  conjecture  since 
the  function  to  be  minimized,  namely  Ai(i),  is  smooth  at  the  solution.  We 
set  the  values  of  the  radius  of  the  disk  Dq  and  the  width  of  the  annulus 
Di  both  equal  to  0.6.  The  annulus  was  divided  into  32  sectors,  so  that  the 
region  fi  depends  upon  32  parameters.  There  were  four  layers  of  rectangles 
in  Di  for  a  total  of  64  rectangles,  Jind  there  were  64  triangles  in  Dq.  The 
total  number  of  vertices  was  337,  including  the  boundciry  vertices,  resulting 
in  bcLuded  matrices  A{x)  and  B{x)  of  size  305  by  305  with  a  bandwidth 
equal  to  59.  Finally  the  initial  value  of  the  trust  radius  was  set  to  0.1. 
The  algorithm  performed  very  well,  conver^ng  to  a  disk  in  a  few  steps. 
For  example,  starting  with  an  ellipse  as  the  initial  shape,  all  32  parameters 
agreed  to  four  significant  digits  after  10  iterations.  (Recall  that  in  our  model 
disks  are  represented  by  setting  all  parameters  equal  to  a  common  value). 
On  the  other  hand,  the  next  15  iterations  failed  to  produce  agreement  on 
the  next  digit.  These  results  show  that  our  discretization  of  the  geometry  is 
adequate,  but  at  the  same  time  reveal  the  limited  'resolution'  of  our  model, 
namely  ordy  the  first  four  or  five  digits  of  the  parameters  are  geometrically 
significant  for  the  triangulation  of  il  given  above. 

The  behavior  of  the  algorithm  on  the  main  problems,  namely  that  of 
minimizing  the  ratio  y^  or  maximizing  the  gap  between  Ai(x)  and  ^2{x) 
was  much  less  satisfying.  First,  recall  that  the  parameters  with  an  even 
index,  X2,,  correspond  to  rays  that  bisect  the  rectangles  in  the  annulus  Di, 
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while  the  odd  indexed  parameters  lai+i  correspond  to  rays  that  coincide 
with  the  sides  of  the  rectangles  (see  section  10).  We  worked  with  the  same 
triangulation  of  fi(i)  as  that  used  for  the  Faber-Krahn  inequality  and  we 
used  various  initial  shapes  and  several  values  for  the  multiplicity  threshold 
r  (see  section  11).  In  all  cases,  we  observed  the  following  pattern  as  the 
algorithm  proceeds.  At  startup  the  multiplicity  of  A2  is  1  and  the  algorithm 
makes  moderate  progress,  that  is  the  values  of  the  parameters  converge  to 
a  common  value  but  very  slowly,  the  objective  function,  either  p{x)  or  r]{x), 
is  reduced  by  smaller  and  smaller  steps,  and  the  trust  radius  is  steadily  de- 
creased. At  the  same  time  the  gap  between  Aj  and  A3  narrows  until  the 
threshold  r  forces  the  multiplicity  of  A2  to  jump  to  2.  At  this  point  the 
algorithm  goes  through  a  few  iterations  without  succeeding  in  producing  a 
decrease  in  the  objective  function.  For  each  such  failure  the  trust  radius  is 
halved,  and  the  algorithm  eventually  produces  a  descent  direction.  Then 
the  following  happens  for  the  next  few  iterations.  The  objective  function 
is  being  steadily  reduced,  but  the  parameters  no  longer  converge  to  a  com- 
mon value.  Instead,  the  even  indexed  parameters  are  being  reduced,  while 
the  odd  ones  are  being  increased.  Thus  ft(x)  resembles  a  polygon  whose 
sides  are  caving  in  more  and  more  toward  the  interior  of  the  polygon.  At 
the  same  time  the  gap  between  A2  and  A3  widens  and  the  trust  radius  is 
steadily  increasing.  (At  each  iteration  the  trust  radius  is  either  doubled, 
halved,  or  left  unchanged,  depending  upon  the  ratio  of  the  actual  increase 
in  the  objective  function  to  the  first  order  estimate  of  this  increase;  when 
the  multiplicity  is  2  this  ratio  is  usually  close  to  1  and  the  trust  radius  is 
doubled,  wherccis  when  the  multiplicity  is  1  this  ratio  is  small  and  the  trust 
radius  is  halved).  Eventually,  the  threshold  r  forces  the  multiplicity  back  to 
1,  and  the  whole  cycle  starts  again.  The  algorithm  oscillates  between  these 
two  opposite  behaviors  for  a  while,  the  actual  number  of  cycles  depending 
upon  the  value  of  the  convergence  tolerance  (the  algorithm  stops  when  the 
norm  of  the  step  produced  falls  below  a  convergence  threshold,  and  when 
the  optimality  conditions  are  satisfied,  of  course). 

At  present  we  still  do  not  understand  what  is  happening.  The  first  guess 
would  be  that  there  remains  a  bug  in  the  program,  but  extensive  checking 
and  testing  of  the  code  has  failed  to  produce  the  source  of  the  problem  so 
far.  It  is  also  possible  that  our  geometric  model,  while  adequate  for  the 
optimization  of  Ai(x),  is  not  sensitive  enough  for  the  ratio  p{x)  or  the  gap 
7;(i).  Finally,  but  least  likely,  the  behavior  of  the  algorithm  may  actually 
reflect  wild  and  unpleasant  properties  of  p(i)  and  77(1)  near  their  minimum. 
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A     Appendix 

In  this  appendix  we  give  the  proof  of  theorem  10.1.  So  let  K  denote  the 
isoparametric  rectangular  element  of  the  ring  Di  with  nodes 

6^     =     (iio  +  a(^i±i)iZi)(l,0) 

6«     =     (i2o  +  a(^)j2,)(l,0) 

b^     =    (i2o  +  a(^^)i?i)(l,0) 

62     =     {Ro  +  a(-^]  Ri){cose,sm0) 
b^    =    {Ro  +  a(-\Ri){cos9,sm0) 
b^     =     {Ro  +  a(^^j  Ri){cose,sme) 

b^    =    {Ro  +  a{^—^JRi){cos-,sm-) 

b'    =     (i2o  +  a(^— —  jiii)(cos-,sm-) 

where  n  is  a  positive  even  integer  and  /;  is  an  odd  integer  with  Q  <  k  <  n 
and  let 

F{t,s)  =  Y.\,{t,s)V 
1 

be  the  map  sending  the  standard  rectangle  K  onto  K  (cf  section  10).  We 
are  going  to  determine  bounds  on  the  values  of  9  and  a,P,^  which  will 
ensure  the  nonvanishing  of  the  jacobian  determinant,  det  JF,  of  F.  These 
conditions  will  apply  also  to  the  other  elements  of  Di  since  nondegeneracy 
is  clearly  rotation  invariant.  So  far  we  have  0  <  ^  <  27r,  and  we  assume  that 

I  Q  -  1  I,  I /?  -  1  I,  I  7  -  1  I  <  e 

for  some  f  to  be  determined. 

The  following  abbreviations  wUl  be  used  below. 

A     =     Rq  +  a—R\  +  Q — s 
n  n 
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B    =    Ro-\-P-Ri+l3—s 
n  n 

C    =     iZo  +  7-i2i  +  7— 5 
n  n 

We  need  to  compute  the  following  expressions. 

(Ai).  +  (A8),  +  (A4V     =     ^(2r  +  l) 
(Ai)r-(A4).     =     \si2r  +  l) 

(A2)r+(A6V+(A3)r      =      ^(2r  "  1) 

(A2)r-(A3)r     =     ^^(2r-l) 

(A5)r  +  (A7)r       =       -2r 
(A5)r-(A7)r       =       -2rS 

(Ai).  +  (A8),  +  (A4),  =  0 

(Ai),-(A4),  =  ir(l  +  r) 

(Aj),  +  (Ae),  +  (A3),  =  0 

(A2).-(A3),  =  — r(l-r) 

(A5).  +  (A7).     =     0 
(As), -(A7),     =     l-r' 

We  now  compute  the  four  entries  of  the  jacobian  matrix  JF. 
(l,l)-entry  : 

E('^jV^i  =  ^^(2'- +  1)  +  ^^(2'- -  l)cos<?  -  2C7rcos^ 

(2,2)-entry  : 


(2,l)-entry  : 

8 


1  0 

^{Xj)rbi  =  -zB{2t-  l)sind-2Crsin- 
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(l,2)-entry  : 

Now  : 

[-A{2r  +  1)  +  T  5(2r  -  1)  cos  fl  -  2Cr  cos  -]  x 

^  /I 


I    2     n    ^  n  / 


1  .  Ri 


=     _iA/?:^r(l  -  r)(2r  +  l)sine  +  2^-^^^^"  "^  ^^^^  '  ^'^''°  2 

_i5^:^ 7.(1  -  7-)(2r  -  1) cos « sine 
4         n 

+i5^:^(2r-l)(l-r2)cos0sin- 

2        n  •^  0        a 

+C/3^r2(l-r)cos^sine-2C7^r(l-r2)cos-sm-. 


and 


[1    :^r(l  +  r)- ^/3^r(l-r)  cos  ^  +  7^(1- r^)  cos  ^Jx 
[-5(2r  -  1)  sin  6  -  2Cr  sin  -]  = 


n  g 

=     -Ba—r(l  +  r)(2r  -  l)sine  -  Ca—r\l  +  r)sin- 
4        n    ^  n  -i 

_i5/j:^r(l-r)(2r-  l)cos0sine 
4        n  . 

+C3^r-'n  -  r)cosesin^  +  ^^7  — (2r  -  1)(1  -  r2)cos-sind 
'^  n  2       2         n  •^ 


n 
Thus  we  have 


-2C7^'-(l-r2)cos-sin-. 


n 

detJF    =     -i4/3^r(l-r)(2r  +  l)sine  +  ^A7-^(2r+l)(l-r^)sin- 


n 
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But 


so  that 
detJF 


-lBf—(2r-  l)(l-r2)cos-sin^ 
2        n  ^  ^^  '        2 


^.C7/3:^^2(l  -  r)cos  ^  sine  -  C/3— r2(l  -  r)  cos^sin - 


n 


1  o    ^1 


R 


12/ 


--5a— r(l  +  r)(2r  -  l)sine  +  Cq— r^(l  +  r)sin-. 
4         n  71  2 


cos  S  sin cos  -  sin  e  =  -  sin  - 

2  2  2 

cos  -  sin  a  —  cos  e  sin  -  =  sin  - 

sinfl  =  2sin  -  cos- 
2        2 


--/I/?— r(l-r)(2r  +  l)sine-^5Q— r(l  +  r)(2r-  l)sine 
An  An 

+^yl7^(2r  +  1)(1  -  r^)  sin  ^  +  Cq^  r2(l  +  r)  sin  ^ 
2        n  2  n  2 


1       Ri 


,Ri 


--57--^(2r-  1)(1  -  r^sin-  +  C/?-ir^(l  -  r)sin- 
Z        n  In  2 


IR\    .    9 

sin  - 

2  n         2 


{-A/3r(l-r)(2r+  l)cos- 


-BQr(l  +  r)(2r  -  l)cos-  +  A-i{2r  +  1)(1  -  r^) 
+2Car2(l  +  r)  -  57(2r  -  1)(1  -  r^)  +  2C/?r2(l  -  r)}. 

Now,  sin  f  7^  0  for  0  <  e  <  2x,  so  that  det  JF  7^  0  if  and  only  if  (?  7^  0, 
where 

Q     =    A-/{2r+l){l-r^)- B'r{2r-l){l-r^)  +  2Car^il  +  r) 

a 
+2C/9r2(l  _  r)  -  cos  -  [A0r{l  -  r)(2r  +  1)  +  Bar{l  +  r)(2r  -  1)] . 

We  compute 
Ay{2r  +  1)(1  -  r^)  =  (7^  +  a-r-Ri  +  ctl-^sj  [-2r3  -  r^  +  2r  +  1] 
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-Bt{2r  -  1)(1  -  r^)  =  (--jR^  -  (3f^Ri  -  Pl^s)  [-2r3  +  r^  +  2r  -  1] 

2Car^(l  +  r)=  (2qRo  +  2at-Ri  +  2a-y—s]  [r^  +  r^l 
\  n  n    J 

2Cl3r\l  -  r)  =  (2PR0  +  2/37^«i  +  ^^t^^)  l"''^  +  '"^l 

A0r{l  -  r)(2r  +  1)  =  ((3Ro  +  ^P^Ri  +  "Z'^^)  [-2'-^  +  r^  +  r] 

Bar(l  +  r)(2r  -  1)  =  (aRo  +  a0-Ri  +  a0—s]  [2r^  +  r^  -  r]. 

\  n  Ti    / 

Thus  we  get  the  following  expressions  for  the  various  terms  of  Q 

iZo-term  :  2[r^{a  -  P)  +  r^{a  +  P  -  f)  + -r]. 

^Ri-term:  7[q(1  +  r)^  + /3(1  -  r)^]. 

^s-term  :  7[q(1  +  rf  +  /9(1  -  rf]. 

cos  f-term  :  Ro[2r^ia  -  (i)  +  r'^{a  +  /3)  -  r(Q  -  /?)] 

+2-i2iQ/3r2  +  2^al3sr\ 
n  n 

Hence  we  may  rewrite  Q  as 

Q     =    2i2o[r^(a  - /?)  +  r2(Q  + /?  -  7)  +  7] 

+  ^i?i7[Q(l  +  rf  +  P{1  -  r)2]  +  :^57[a(l  +  r)^  +  /3(1  -  rf] 
-  cos  ^  {i?o[2r3(a  -  /?)  +  r2(a  +  /?)  -  r(a  -  /?)] 


+2-i2ia/3r2  +  2 
n 

=     m(r)5  +  6(r) 


n  J 


where 


m(r)  =  —    f  7Q  +  7/3  -  2  cos  -Q/3  j  r^  +  27(a  -  P)r  +  7(0  +  (3) 


and 

6(r)    = 


2i?o(a-/3)(l-cos-) 


+ 
+ 


2R^{a  +  /3  -  7)  +  -i?i7(a  +  /3)  -  cos|iZo(a  +  /?)  -  2cos^-Ria0 
n  L  In  , 


[2^i2i7(«-/3) 


e 

+  cos-iZo(a-/3) 


r  + 
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so  that 

6(r)  =  e(r)  +  km{r), 


where 


2(a  -  ^)(1  -  cos  -)r3  +  [2(a  +  /?  -  7)  -  (q  +  /?)  cos  -]r^ 
+[(a-/3)cos^]r  +  27|. 

So  we  have  obtained  the  following  expression  for  Q 

Q  =  m(r)s  +  e(r)  +  kTn{r). 

Now  Q{t,s)  vanishes  for  some  t,s  e  [-1,1]  precisely  when  either  of  the 
following  two  conditions  are  met. 

1.  3r  e  [-1, 1]  such  that  m(r)  =  6(r)  =  0 

2.  m(r)  5^0  and  5  =  -^€[-1,1]. 

Remark  1.1  We  consider  the  special  case  q  =  /3  =  7  =  1,  i.e.  the  case  of 
the  unperturbed  element.  Then 

m(r)     =     2:^[(l-cos|)r2  +  l] 

e(r)     =     2i?o[(l-cos^)r2  +  l] 

6(r)     =     2[{l-cos^)r' +  I]  (^B^  +  k^y 

Clearly  m{r)  >  0  for  all  r  e  3?  and 

2[(l-cosf)r2  +  l](i?o  +  |i2i) 


s     =     — 


2^[(l-cosf)r2  +  l] 


■(nf +  ^). 


Since  0  <  /:  <  n,  we  have 

k  +  n^>l,i{Ro>0, 
so  that  Q  does  not  vanish  on  [-1,1]  x  [-1, 1]. 
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Let  us  return  to  the  general  case.  We  proceed  in  two  steps.  First,  we 
show  how  to  ensure  that  m(r)  ^  0  for  r  €  [—1, 1].  Second,  we  show  how  to 
force  —b(r)/m{r)  to  lie  outside  of  [—1, 1]  for  r  €  [—1, 1]. 

(1)  First  we  force  the  coefficients  of  r^  in  m{r)  to  be  >  0.   (Observe  that 
this  is  the  case  when  a  =  /3  =  7  =  l).  Thus  we  want 

a 
7Q  +  7/?  —  2  cos  -a0  >  0 

or  equivalently 

(Q  +  /9)  ^^     e 

f- —  >  2 COS-. 

'      Q/3       -  2 


Let  g{a,P,'f)  =  'j^.  We  have 


n     -       "^ 

9a  = 2 

7 
^0  =  ~'02    (^y  symmetry) 

a  +  /3 
^'  =  -^- 

Let  C  denote  the  cube  in  ^  given  by 

C  =  {(a,/3,7)|  I  a- 1  I,  1/3-1  I,  I  7-1  |<  6} 

for  some  e  >  0  to  be  determined.  Qearly,  g  does  not  have  any  critical  points 
in  C,  so  to  find  the  minimum  of  j  on  C  it  is  enough  to  consider  the  boundary 
dC.  But  actually,  examining  the  restriction  of  g  to  the  six  faces  of  dC,  we 
see  that  it  suffices  to  consider  the  vertices  of  dC.  Moreover,  as  g{a,(3,j)  is 
obviously  symmetric  in  a  and  /?,  and 

gia,l3,l  +  €)>g{a,l3,l-€), 

we  only  have  three  vertices  to  consider. 

1.  a=  l-c,/3=  1  -e,7  =  l-€. 

g{a,P,t)  =  2  >  2cos-  for  any  0  <  ^  <  2;r 
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2.  a=  !  +  £,,/?  =  1  -f,,7  =  1-f- 


Thus  we  get,  provided  that  cos  5  ^  0» 


+  f 


e  <  sec 1. 

2 


3.  Q=  l  +  £,/3  =  l  +  e,7  =  l-€. 


5(",/3,7)  =  2 


1-f 


Thus  we  get 


€  < 


1  -  COS  I 
1  +  COS  f  ' 


Observe  that 


1  —  cos  I       1  —  cos  I 


< 


i  - 
2 


1  +  cos  7;  COS  t; 


-  =  5ec  -  —  1  if   COS  -  7^  0, 


and  thus  a  sufficient  condition  for  the  coefficient  of  r^  in  rn{r)  to  be  positive 
is 


1  a  -  1  I,  I  /3  -  1  I,  I  7  -  1  I  <  f,    where  e  <  f— ^. 

1  +  cos  I 


(1.2) 


Remark  1.3  We  want  f  <  1  !  This  will  follow  from  (1.2)  if  we  restrict  6  to 
lie  in  (0,7r). 

Now  that  the  coefficient  of  r^  is  positive,  the  critical  point  ro  of  m(r) 
yields  a  global  minimum  of  Tn{r)  on  3?.  We  have 


ro     = 


7(q-/?) 


Q7  +  /J7  -  2a(i  cos  | 


iZi 


m(ro)     =     — 


7(a  +  /?) 


7'(a  -  /3)2 


Q7  +  /?7  -  2a^  cos  f 


2  J 
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Thus 

7^(a 

m^r    ■»  -^i    n       ^ s       -tin    \     f{\                       /    V" 

-0? 

"7  +  P7  - 

■2Q/3cosf 

27               9 

Clearly  for  (a,/?, 7)  €  C,  we  have 

27     ^  1  -  f 

a  +  P  -  1  +  e 

and  we  know  that 

l  +  .>^°^2' 

>0 


so  that  we  may  conclude  that  m(r)  >  0  for  all  r  €  3?,  when  {a,f3,f)  €  C 
with  e  satisfying  equation  (1.2). 

(2)  Let  us  write  x  =  cos|.  We  now  choose 

We  want  to  find  a  condition  on  x  ensuring  that 

-^,^[-hl],ioTailre[-l,l]. 
m(r) 

Recall  that  6(r)  =  e(r)  +  fcm(r),  so  that 


m{r)  m(r) 

and  —k  <  —1  since  k  is  an  odd  integer  with  0  <  A:  <  n.  Thus  it  suffices  to 
show  that 

e(r)  >  0,  for  all  r€  [-1,1]- 

Now  e(r)  =  Rcp{r)  where 

p{r)  =  2(q  -  /3)(1  -  x^  +  [2(q  +  /3  -  7)  -  (a  +  l3)xy  +  (q  -  ^)ir  +  27. 

Let 

p    =     2(a  + /3  -  7)  -  (a  + /?)! 
H    =    2(q-/J)(1-i) 
*     =     {a-P)x 
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and  observe  that,  writing  p(r)  =  p(r,a,P,j),  we  have 

p(r,/3,a,7)  =  p(-r,a,/3,7), 

so  that,  to  show  p{r)  >  0  for  all  r  €  [-1,1]  and  for  all  (a,/?, 7)  e  C,  it  is 
enough  to  consider  r  €  [—1, 1]  and 

(a,/9,7)eC  =  {(a,/3,7)|a>/3}nC. 

When  Q  =  /?  we  have 

p{r)  =  2[a(2  -  x)  -  f]r^  +  27  >  27  >  0 

for  all  r  G  3J,  provided  q(2  -  x)  -  7  >  0.   But,  7<l  +  e,  q>1-£  and 
2  —  X  >  0,  so  it  suffices  to  show  that 

l  +  e<(l-e)(2-x), 

i.e. 

3-x 
By  our  choice  of  e,  (1.4),  we  have 

1 1-x       1-x 

- < <=!>   X  >  1/3 

2 1+x - 3-x  -    ' 

and  we  get 

e 

cos- >  1/3.  (1.5) 

We  have  shown  that,  when  a  =  /3, 

P(r)  >  27  >  2(1  -  £),  for  all  r  6  §f. 
We  now  show  that 

P  >  -^^    +^^-\  for(Q,/?,7)  e  C.  (1.6) 

1  +  X 

We  let  g{a,P,'/)  =  2(a  +  /?  -  7)  -  (a  +  /3)x,  and  we  have 

gc     =    2-x>0) 

gp     =    2-1     >     0>  for  values  of  Q,/?,  7  G  ». 

g^     =       -2       <     0  J 

It  follows  that  the  minimum  of  ^  on  C  occurs  at  the  vertices  of  C.  However 
g  is  symmetric  in  a,  (3  and 

«7(a,/?,7)>5(a,/?,l  +  e), 
so  there  are  only  three  vertices  to  consider. 
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1.  Q  =  l  +  €,^=  l-€,7=  1  +  f- 

p  =  2{l-x)-2€  (1.7) 

2.  a=  H-e,/3  =  l  +  e,7  =  1  + f • 

/)  =  2(l-i)-2€(l-i)  (1.8) 

3.  Q  =  1-t,^  =  l-€,7  =  1  +  f- 

/)  =  2(1  -  x)  -  2e(3  -  i).  (1.9) 

Now  for  0  <  ^  <  JT,  we  have 

l-i<l<3-i 

so  that 

(1.9)  <  (1.7)  <  (1.8). 

By  (1.4)  again,  we  have 

and  inequality  (1.6)  follows. 

The  roots  of  -3i^  +  4x  -  1  are  1  and  1/3,  so  that 

-3x2  +  4x  -  1  ^  ^^  ^^^  ^  ^^2  ^  ^  ^  ^ 


1  +  x 
Thus  we  have  shown  that  for  (a,/3,7)  G  C, 

>      '^'^    +^^ — i  >  0,  for  1/3  <  cos-  <  1. 
^-  l+x  2 

Moreover,  if  a  >  /3,  then 

0  <  /x  <  4e(l  -  x)   and  0  <  ^  <  2ex. 
But 

(l-l)2  ,  X(l-X) 

4€(1  -  x)  =  2^ '-   and  2ex  =  -V" '-• 

^  '  l+x  l+x 
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Thus,  for  Q  >  /?,  we  have 

p(r)  =  fxr^  +  pr^  +  6r  +  2-f>0,  for  all  r  €  [0, 1], 
so  it  suffices  to  consider  r  €  [-1,0].  Then  r^  <  0,  r  <  0,  so  that 

and 

p(r)>0    <;=>     2{l-xfr^  +  {-3x'^+4x-iy 
i(l -a:)r  + 2(1 -€)(l  +  a:)  >  0. 

N°"  11-x         1  +  3X 

1  -  e  =  1  -  - 


2  1  +  1       2(1 +  i)' 

so  that 

2(l  +  x)(l-€)=  l  +  3x, 

and  we  must  show  that 
2(1 -i)^r3  + (-3x2 +  4x- 1)7-2 +  i(l-x)r+l  +  3i  >  q,  V  r  G  [-1,0] 


i.e. 


(-3x2  ^.  4a.  _  1)^2  +  1  +  3x  >  -r[2(l  -  x)2r2  +  x(l  -  x)]. 
Since  -r  <  1  on  [-1,0],  it  is  enough  to  show  that 

2(1  -  x)2r2  +  x(l  -  x)  <  (-3x2  ^  4a.  _  1)^2  +  ^  ^.  33.   y  r  e  [-1,0]. 
This  will  certainly  be  the  case  if 

(a)  x(l  -x)  <  l  +  3x 

(b)  2(1 -x)2  <  -3x2  +  4x-l 

But  (a)  holds  if  and  only  if  (1  +  x)2  >  0,  which  holds  for  all  x  7^  —1. 

On  the  other  hand,  (b)  holds  if  and  only  if  -5x2  -|-  gx  -  3  >  0.  The  roots  of 

this  polynomial  are  1  and  3/5.  Thus  the  inequality  (b)  holds  for  all 

3/5  <  i<  1. 


61 


To  summarize,  we  have  proved  theore    10.1,  namely  that  the  jacobian 
detJF  does  not  vanish  for  all  a,^,f  and  0  such  that 


1  1  -  cos  f 


|a-l  1,1/3-1  |,l7-l  l<^7- j 

^  1  +  cos  § 

and 

0 
3/5  <  cos-  <  1. 
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